Methods and reagents for molecular cloning

ABSTRACT

The present invention provides compositions, methods, and kits for covalently linking nucleic acid molecules. The methods include a strand invasion step, and the compositions and kits are useful for performing such methods. For example, a method of covalently linking double stranded (ds) nucleic acid molecules can include contacting a first ds nucleic acid molecule, which has a topoisomerase linked to a 3′ terminus of one end and has a single stranded 5′ overhang at the same end, with a second ds nucleic acid molecule having a blunt end, such that the 5′ overhang can hybridize to a complementary sequence of the blunt end of the second nucleic acid molecule, and the topoisomerase can covalently link the ds nucleic acid molecules. The methods are simpler and more efficient than previous methods for covalently linking nucleic acid sequences, and the compositions and kits facilitate practicing the methods, including methods of directionally linking two or more ds nucleic acid molecules.

This application claims priority under 35 U.S.C. §119 to U.S. Ser. No. 60/226,563, filed Aug. 21, 2000, the entire content of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates generally to compositions and methods for facilitating the construction of recombinant nucleic acid molecules, and more specifically to compositions useful for covalently linking two or more nucleic acid molecules, including for directionally or non-directionally linking the nucleic acid molecules, and to methods of generating such covalently linked recombinant nucleic acid molecules.

2. Background Information

The ability to clone large numbers of nucleotide sequences, including gene sequences and open reading frames, allows a great deal of information to be obtained about gene expression and the regulation thereof. In addition, such sequences can be useful for understanding the etiology of disease conditions and, ideally, can provide a means to diagnose and treat such diseases. However, while it is a relatively simple matter to clone large numbers of expressed nucleotide sequences, for example, it is a more difficult undertaking to characterize the regulatory elements involved in the expression of such sequences and to properly express a polypeptide encoded by the sequence. In particular, there is a need for improved methods for ligating nucleic acid molecules and cloning nucleic acid molecules such that a functional recombinant nucleic acid molecule is produced. There is a particular need for directional cloning methods, wherein an insert can be cloned into a vector or linked to one or more other nucleic acid molecules in a predetermined orientation.

The use of topoisomerases provides a convenient means to improve cloning and ligation methods. For example, the use of topoisomerase to perform rapid ligation of polymerase chain reaction (PCR) products into a vector has cut traditionally laborious cloning methods down to a five-minute procedure. As such, topoisomerase is particularly useful for high throughput cloning applications. However, given the current demand for expressing open reading frames (ORFs) in genome scale molecular cloning procedures, there still remains a need to better control the orientation in which two or more nucleic acid molecules are linked such that functional recombinant nucleic acid molecules such as expressible cloned nucleic acid molecules can be prepared.

Expression of cloned ORFs demands that the PCR product be inserted into the vector in its correct orientation, so as to work in accord with functional expression domains located on the vector. In the current state of the art for topoisomerase mediated cloning, ORFs are amplified by PCR using various DNA polymerases. A polymerase such as Taq, which does not have a proof-reading function and has an inherent terminal transferase activity, is commonly used, and produces PCR products containing a single, non-template derived 3′ A overhang at each end. These amplification products can be efficiently cloned into topoisomerase-modified vectors containing a single 3′ T overhang at each end (TOPO TA Cloning® Kit, Invitrogen Corp., Carlsbad, Calif.). In comparison, a polymerase such as Pfu, which has an inherent 3′ to 5′ exonuclease proof-reading activity, produces PCR products that are blunt-ended. Topoisomerase-modified vectors containing blunt ends are available for cloning of PCR products produced with proofreading polymerases (Zero Blunt TOPO® PCR Cloning Kit, Invitrogen Corp., Carlsbad, Calif.). Incubation of either PCR product with the appropriate topoisomerase-modified vector results in ligation within five minutes. However, the orientation of the insert obtained using such cloning methods is random.

Because the orientation of DNA fragment insertion into topoisomerase-modified cloning vectors is random, users must screen clones to identify those having the proper orientation. Insert orientation can be determined using various methods including, for example, restriction enzyme analysis, in vitro transcription from vector-encoded promoter elements, and PCR using, for example, one insert-specific primer and one vector-specific primer. As is evident, however, determining insert orientation requires an investment of time and can substantially increase the cost of identifying a nucleic acid molecule of interest, particularly where a high throughput cloning method is used. As such, current cloning methods are severely limited, particularly for high throughput gene expression analysis, because numerous laborious steps must be performed in order to select clones with correctly oriented inserts, and there is a need to screen as many as eight colonies of each clone to identify one having the proper orientation. Thus, a need exists for methods and reagents that are useful for covalently linking two or more nucleic acid molecules in a directional orientation. The present invention satisfies this need and provides additional advantages.
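The screening burden described above can be illustrated with a short calculation. The following is a minimal sketch that assumes, purely for illustration, that each colony independently carries the insert in the desired orientation with probability 0.5; the function name and the default probability are hypothetical and are not taken from this document.

```python
def prob_at_least_one_correct(n_colonies: int, p_correct: float = 0.5) -> float:
    """Probability that at least one of n screened colonies has the desired
    insert orientation, assuming independent colonies (an assumption)."""
    if n_colonies < 0:
        raise ValueError("n_colonies must be non-negative")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie between 0 and 1")
    # Complement of the event "every screened colony is wrongly oriented".
    return 1.0 - (1.0 - p_correct) ** n_colonies


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, prob_at_least_one_correct(n))
```

Under that assumption, screening eight colonies leaves a 1 in 256 chance that none of them is correctly oriented, which is consistent with screening several colonies per clone when orientation is random.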

SUMMARY OF THE INVENTION

The present invention provides compositions and methods for covalently linking two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) double stranded (“ds”) nucleic acid molecules, including directionally or non-directionally linking two or more ds nucleic acid molecules. Nucleic acid molecules used in accordance with the invention preferably comprise a first end and a second end. The first and/or second end of such molecules preferably has a 5′ and/or 3′ extension or overhang. Thus, one or both ends of the nucleic acid molecules used in the invention can have a 3′ and/or 5′ overhang. The overhang sequences can be the same or different sequences, and can be the same or different types (e.g., 3′ or 5′ overhang) at both ends of the molecule. In addition, while one end of the nucleic acid molecule can have a 3′ extension and 5′ extension, the other end of the molecule can, but need not, have an extension. In some aspects, one end of a nucleic acid molecule can contain a 3′ overhang or 5′ overhang while the other end can be blunt ended (i.e., it has no overhang). In accordance with the invention, the 3′ and/or 5′ extension sequences (i.e., overhangs) at any terminus can be any length (i.e., any number of nucleotides), and can have any sequence. Thus, the invention relates to nucleic acid molecules having single or multiple nucleotide overhangs. In some aspects, the nucleic acid molecules and their termini can include modified or labeled nucleotides. In the use of the invention, enzymes or proteins capable of fusing or joining or ligating nucleic acid molecules can be used. Thus, two or more nucleic acid molecules, which can be the same or different, can be joined directionally using such enzymes. Such enzymes or proteins include, but are not limited to, topoisomerases (including types IA, IB, II, etc.), recombinase proteins (including FLP recombinase, Int integrase, Cre recombinase, etc.), and ligases (including T4 DNA ligase, etc.).

In the methods of the invention, the 3′ or 5′ overhang of one terminus of the first nucleic acid molecule can have homology (or be complementary) to at least one sequence at or near the terminus of at least a second nucleic acid molecule. Thus, through base pairing or hybridization of the 3′ or 5′ overhang or extension with the homologous or complementary sequence on the second molecule, the invention allows directional or non-directional association or joining of two different molecules. In a preferred aspect, the 3′ or 5′ overhang of one terminus of at least a first molecule can engage in strand invasion as it associates or hybridizes with its complementary sequence at or near the terminus of the second molecule. In one aspect, such a strand invasion event allows the 3′ or 5′ overhang to directionally associate with a desired end of its partner molecule. By designing the overhangs and the termini of the molecules to be joined, two or multiple partner molecules can be joined in the presence of one or more proteins or enzymes having ligase activity (e.g., topoisomerases, ligases, recombinases, etc.) in accordance with the invention. Thus, the invention provides methods for connecting two or more nucleic acid molecules (e.g., double stranded nucleic acid molecules) which involve covalently linking at least one strand of one molecule to at least one strand of another molecule. The invention further provides compositions for preparing nucleic acid molecules connected by methods of the invention and compositions produced by methods of the invention.

Processes of the invention are exemplified by methods described herein which involve the covalent linkage of strands of different nucleic acid molecules catalyzed by topoisomerase. Thus, the present invention relates, in part, to an isolated ds nucleic acid molecule having a first end and a second end, wherein the first end contains a first 5′ overhang and a first topoisomerase covalently bound at the 3′ terminus, and the second end contains a second topoisomerase covalently bound at the 3′ terminus and contains a second 5′ overhang, a blunt end, or a 3′ thymidine overhang, wherein the first 5′ overhang is different from the second 5′ overhang. The first topoisomerase and second topoisomerase can be the same or different. The first 5′ overhang can have any nucleotide sequence, including, for example, the nucleotide sequence 5′-GGTG-3′.

In one embodiment, the ds nucleic acid molecule is a vector, which can be a linear vector such as a lambda vector or a linearized vector such as a linearized plasmid. The vector can be a cloning vector or an expression vector, and can contain, for example, one or more (e.g., 1, 2, 3, 4, 5, 6, etc.) recombinase recognition sites such as one or more lox sites or one or more att sites, one or more transcriptional regulatory elements, one or more translational regulatory elements, one or more nucleotide sequences encoding a peptide of interest such as one or more selectable markers or one or more tags, or combinations thereof. For example, the vector can be a pUni/V5-His version A (SEQ ID NO: 16) vector or a pCR®2.1 (SEQ ID NO: 17) vector.

The present invention also relates to methods of directionally or non-directionally linking two, three, four or more nucleic acid molecules, including, as desired, operatively linking two or more of the nucleic acid molecules. A method for generating a directionally linked recombinant nucleic acid molecule can be performed, for example, by contacting a topoisomerase-charged first ds nucleic acid molecule, which has a first topoisomerase covalently bound at a first end, and a second topoisomerase covalently bound at a second end, and also contains a 5′ overhang at the first end and a blunt end, a 3′ uridine overhang, a 3′ thymidine overhang, or a second 5′ overhang at the second end; and at least a second ds nucleic acid molecule, which has a first blunt end and a second end, wherein the first blunt end has a 5′ nucleotide sequence that is complementary to the first 5′ overhang of the first end of the first nucleic acid molecule. The first and second topoisomerases can be the same, for example, two type IB topoisomerases such as two Vaccinia type IB topoisomerases, or can be different, including two type IB topoisomerases from different organisms or a type IB topoisomerase and a type IA or a type II topoisomerase.

In performing a method of the invention, the first and second (or other) ds nucleic acid molecules are contacted under conditions such that the 5′ nucleotide sequence of the first blunt end of the second nucleic acid molecule can selectively hybridize to the first 5′ overhang, whereby the first topoisomerase can covalently link the 3′ terminus of the first end of the first ds nucleic acid molecule to the 5′ terminus of the first blunt end of the second ds nucleic acid molecule, and the second topoisomerase can covalently link the 3′ terminus of the second end of the first ds nucleic acid molecule to the 5′ terminus of the second end of the second ds nucleic acid molecule, to generate a directionally linked recombinant nucleic acid molecule. Accordingly, the present invention provides a directionally or non-directionally linked recombinant nucleic acid molecule produced by such a method.

In one aspect of performing a method of the invention, the second end of the first topoisomerase-charged ds nucleic acid molecule has a blunt end, and the second end of the second ds nucleic acid molecule has a blunt end. In another aspect, the second end of the topoisomerase-charged first ds nucleic acid molecule has a 3′ thymidine overhang, and the second end of the second ds nucleic acid molecule has a 3′ adenosine overhang, or the second end of the topoisomerase-charged first ds nucleic acid molecule has a 3′ uridine (or modified form thereof, for example, deoxyuridine) overhang, and the second end of the second ds nucleic acid molecule has a 3′ adenosine overhang. In yet another aspect, the topoisomerase-charged first ds nucleic acid molecule has a second 5′ overhang at the second end, and the second end of the second ds nucleic acid has a nucleotide sequence complementary to the second 5′ overhang. The topoisomerase-charged first ds nucleic acid molecule can, but need not be, a vector, including a cloning vector or an expression vector.

A method of the invention can further include introducing a directionally or non-directionally linked recombinant nucleic acid molecule into a cell, which can be a prokaryotic cell such as a bacterium or a eukaryotic cell such as a mammalian cell. Accordingly, the present invention also provides a cell produced by a method of the invention, as well as a non-human transgenic organism produced from such a cell.

The topoisomerase-charged first ds nucleic acid molecule can be a vector, and the second ds nucleic acid molecule used in a method of the invention can be an amplification product. In addition, the second ds nucleic acid molecule can be one of a plurality of second ds nucleic acid molecules, for example, individual members of a cDNA library or a combinatorial library.

A method for generating a directionally or non-directionally linked recombinant nucleic acid molecule also can be performed, for example, by contacting a first precursor ds nucleic acid molecule having a first end, which has a first 5′ target sequence at the 5′ terminus and a topoisomerase recognition site at the 3′ terminus, and a second end, which has a topoisomerase recognition site at the 3′ terminus; a second ds nucleic acid molecule having a first blunt end and a second end, wherein the first blunt end has a 5′ nucleotide sequence complementary to the 5′ target sequence of the first precursor ds nucleic acid molecule; and a topoisomerase that is specific for the topoisomerase recognition site. The first ds nucleic acid molecule, second ds nucleic acid molecule and topoisomerase are contacted under conditions that allow topoisomerase activity, i.e., such that the topoisomerase can bind to and cleave the recognition site, to produce a topoisomerase-charged 3′ terminus, and can ligate the 3′ terminus to an appropriate 5′ terminus. Such conditions also allow hybridization of the portion of the first 5′ target sequence that remains following cleavage by the topoisomerase and the 5′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule, wherein the 5′ nucleotide sequence of the first blunt end is complementary to that portion of the 5′ target sequence.

In one aspect of performing a method of the invention, the second end of the first precursor ds nucleic acid molecule is a blunt end upon cleavage by the topoisomerase, and the second end of the second ds nucleic acid molecule is a blunt end. In another aspect, the second end of the first precursor ds nucleic acid molecule has a 3′ thymidine extension upon cleavage by the topoisomerase, and the second end of the second ds nucleic acid molecule comprises a 3′ adenosine or 3′ uridine, for example, deoxyuridine, overhang. In yet another aspect, the first precursor ds nucleic acid molecule has a second 5′ target sequence at the second end, and the second end of the second ds nucleic acid molecule has a 5′ nucleotide sequence complementary to at least a portion of the second 5′ target sequence.

The first precursor ds nucleic acid molecule can be a vector, including a cloning vector and an expression vector, and, where the vector generally is available in a circular form, can be linearized due to the action of the topoisomerase, or can be linearized by including, for example, one or two restriction endonucleases that linearize the vector such that, upon contact with the topoisomerase, the first and second ds nucleic acid molecules can be directionally or non-directionally linked according to a method of the invention. The present invention also provides a directionally or non-directionally linked recombinant nucleic acid molecule produced according to a method of the invention, which can further include, for example, a step of introducing the directionally linked recombinant nucleic acid molecule into a cell. Accordingly, the present invention also provides a cell containing such a directionally or non-directionally linked recombinant nucleic acid molecule, as well as a transgenic non-human organism generated from such a cell.

The first precursor ds nucleic acid molecule can include one or more (e.g., 1, 2, 3, 4, 5, 6, 7, etc.) expression control elements, which can be operatively linked to each other, and the second ds nucleic acid molecule can encode all or a portion of an open reading frame, wherein the expression control element is operatively linked to the open reading frame in a directionally linked recombinant nucleic acid molecule generated according to a method of the invention. In addition, the second ds nucleic acid molecule can be one of a plurality of second ds nucleic acid molecules, for example, individual members of a cDNA library.

A method for generating a directionally linked recombinant nucleic acid molecule also can be performed by contacting a topoisomerase-charged first ds nucleic acid molecule, which has, at a first end, a first 5′ overhang and a first topoisomerase covalently bound to the 3′ terminus, and a second ds nucleic acid molecule, which has a first blunt end and a second end, wherein the first blunt end includes a 5′ nucleotide sequence complementary to the first 5′ overhang. The method is performed under conditions such that the 5′ nucleotide sequence of the first blunt end can selectively hybridize to the first 5′ overhang, whereby the first topoisomerase can covalently link the 3′ terminus of the first end of the first ds nucleic acid molecule with the 5′ terminus of the first end of the second ds nucleic acid molecule.
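The complementarity requirement described above can be checked computationally. The following is a minimal sketch, assuming DNA sequences written 5′ to 3′ with only the letters A, C, G and T and requiring a perfect match over the full overhang; the function names and the example insert sequences are hypothetical and are given only for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def invadable_ends(overhang_5to3: str, insert_top_5to3: str) -> list:
    """List the blunt ends of an insert whose 5' nucleotide sequence is
    complementary to the 5' overhang of the first molecule.

    "first" is the end carrying the 5' terminus of the top strand;
    "second" is the end carrying the 5' terminus of the bottom strand.
    """
    # A 5' sequence pairs with the overhang when it equals the overhang's
    # reverse complement (both written 5' to 3').
    target = reverse_complement(overhang_5to3)
    top = insert_top_5to3.upper()
    bottom = reverse_complement(top)
    ends = []
    if top.startswith(target):
        ends.append("first")
    if bottom.startswith(target):
        ends.append("second")
    return ends


def is_directional(overhang_5to3: str, insert_top_5to3: str) -> bool:
    """True when exactly one end of the insert can pair with the overhang."""
    return len(invadable_ends(overhang_5to3, insert_top_5to3)) == 1


if __name__ == "__main__":
    print(invadable_ends("GGTG", "CACCATGGCTAGCTAA"))
    print(is_directional("GGTG", "CACCATGGGTG"))
```

In this sketch, an insert is treated as directional only when one of its two ends matches; an insert whose two ends both begin with the complementary sequence could hybridize in either orientation.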

Such a method can further include contacting the topoisomerase-charged first ds nucleic acid molecule and the second ds nucleic acid molecule with a third ds nucleic acid molecule, wherein a first end of the third ds nucleic acid molecule has a 5′ overhang and a second topoisomerase covalently bound at the 3′ terminus, and wherein the second ds nucleic acid molecule has a second blunt end, which includes a 5′ nucleotide sequence complementary to the second 5′ overhang. The contacting can be performed, for example, under conditions such that the 5′ nucleotide sequence of the second blunt end of the second ds nucleic acid can selectively hybridize to the 5′ overhang of the first end of the third ds nucleic acid molecule, whereby the second topoisomerase can covalently link the 3′ terminus of the first end of the third ds nucleic acid molecule with the 5′ terminus of the second blunt end of the second ds nucleic acid molecule. Similarly, the method can be used to directionally or non-directionally link a fourth, fifth, sixth, or more ds nucleic acid molecules, wherein the ends of such ds nucleic acid molecules are selected as exemplified herein. The first and second (or other) topoisomerases can be the same or different and, if desired, the first or third ds nucleic acid molecules, instead of being topoisomerase-charged, can contain a topoisomerase recognition site, wherein the method can further include contacting the reactants with a topoisomerase.

A method of the invention can be performed simultaneously or sequentially. A method of the invention can be performed sequentially, for example, such that the first ds nucleic acid molecule is directionally linked to the second ds nucleic acid molecule and, at a later time or in a different reaction vessel, the third ds nucleic acid molecule is directionally linked to the second ds nucleic acid molecule. Alternatively, the method can be performed simultaneously, wherein all of the reactants are included together at the same time.

Methods of the invention are particularly useful for operatively linking two or more (e.g., 2, 3, 4, 5, 6, 7, 8, etc.) ds nucleic acid molecules, including, for example, operatively linking an expression control element to an open reading frame, or operatively linking a first and second open reading frame to generate a recombinant nucleic acid molecule encoding a fusion protein, which can be further operatively linked to one or more expression control elements. For example, in practicing a method of the invention, a first ds nucleic acid molecule can include an expression control element, a second ds nucleic acid molecule can encode an open reading frame, and a third ds nucleic acid molecule can encode a peptide, wherein, in the directionally linked recombinant nucleic acid molecule, the expression control element is operatively linked to the open reading frame, and the second ds nucleic acid molecule is operatively linked to the third ds nucleic acid molecule, and wherein the operatively linked second and third ds nucleic acid molecules encode a fusion protein comprising the open reading frame and the peptide. The peptide can be any peptide or polypeptide, including a gene product or other open reading frame, a tag (e.g., an affinity tag), a detectable label, and/or the like.

The present invention also relates to a composition, which includes a first ds nucleic acid molecule having a first end and a second end, wherein the first end has a 5′ overhang and a topoisomerase covalently bound at the 3′ terminus; and a second ds nucleic acid molecule having a first blunt end and a second end, wherein the first blunt end has a first 5′ nucleotide sequence, which is complementary to the first 5′ overhang, and a first 3′ nucleotide sequence complementary to the first 5′ nucleotide sequence. In such a composition, the first 5′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule can be hybridized to the first 5′ overhang of the first end of the first nucleic acid molecule, wherein the first 3′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule is displaced. The first ds nucleic acid molecule in such a composition can further have a second 5′ overhang at the second end, and the second end of the second ds nucleic acid molecule can further include a second 5′ nucleotide sequence, which is complementary to the second 5′ overhang, and a second 3′ nucleotide sequence complementary to the second 5′ nucleotide sequence.

The present invention also relates to kits, which contain one or more reagents useful for directionally linking ds nucleic acid molecules. In one embodiment, a kit of the invention contains a ds nucleic acid molecule having a first end and a second end, wherein the first end contains a first 5′ overhang and a first topoisomerase covalently bound at the 3′ terminus, and the second end contains a second topoisomerase covalently bound at the 3′ terminus and contains a second 5′ overhang, a blunt end, or a 3′ thymidine overhang, wherein the first 5′ overhang is different from the second 5′ overhang. The topoisomerases can be the same or different, and the ds nucleic acid molecule can be a vector, and can contain an expression control element.

In another embodiment, a kit of the invention contains a first ds nucleic acid molecule, which has a first topoisomerase covalently bound at a 3′ terminus of a first end, and a second topoisomerase covalently bound at a 3′ terminus of a second end, wherein the first end also has a first 5′ overhang and the second end also has a blunt end, a 3′ thymidine overhang, or a second 5′ overhang, wherein, when present, the second 5′ overhang is different from the first 5′ overhang; and a plurality of second ds nucleic acid molecules, wherein each ds nucleic acid molecule in the plurality has a first blunt end, and wherein the first blunt end includes a 5′ nucleotide sequence complementary to the first 5′ overhang of the first ds nucleic acid molecule. The second ds nucleic acid molecules in the plurality can be a plurality of transcriptional regulatory elements, translational regulatory elements, or a combination thereof, or can encode a plurality of peptides such as peptide tags, cell compartmentalization domains, and the like.

A kit of the invention can contain one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, etc.) topoisomerase-charged ds nucleic acid molecules of the invention, for example, one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, etc.) topoisomerase-charged vectors; one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, etc.) precursor ds nucleic acid molecules, which can be contacted with a topoisomerase to produce a topoisomerase-charged ds nucleic acid molecule of the invention; or a combination thereof. The kit also can contain one or more primers or primer pairs, for example, for preparing one or a plurality of second ds nucleic acid molecules using an amplification reaction; one or more control ds nucleic acid molecules to test or standardize the components of the kit; one or more cells, which can be, for example, competent cells into which a recombinant nucleic acid molecule generated according to a method of the invention can be introduced; one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, etc.) reaction buffers for performing a method of the invention; instructions for carrying out the method; and the like.

In one embodiment, a method for generating a directionally or non-directionally linked recombinant nucleic acid molecule is performed using a first ds nucleic acid molecule with one single stranded overhang, and one topoisomerase site or one topoisomerase bound thereto. In another embodiment, a third nucleic acid molecule is included. In accordance with this aspect of the invention, unique overhang sequences can be prepared for the different ds nucleic acid molecules to be linked, such that the nucleic acid molecules can be linked directionally and in any desired order. Similarly, the method can be used to link any number of nucleic acid molecules, including directionally linking two or more of the number of nucleic acid molecules. In certain embodiments involving a topoisomerase-charged ds nucleic acid molecule containing an expression control element, a third (or other) ds nucleotide sequence also can comprise one, two or more expression control elements or other sequence of interest.

The present invention provides a method for the directional insertion of DNA fragments into cloning or expression vectors with the ease and efficiency of topoisomerase-mediated cloning. This method has advantages over current cloning systems because it decreases the laborious screening process necessary to identify cloned inserts in the desired orientation. In one aspect, the method utilizes a linearized expression vector having a single topoisomerase molecule covalently attached at each of its two 3′ ends. A first end of the linearized vector also can contain a 5′ single stranded overhang, and the second end can be either blunt, possess a single 3′ thymidine extension for T/A cloning, or can itself contain a second 5′ single stranded overhang sequence. The single stranded overhang sequences can be any convenient or desired sequence.

Construction of a topoisomerase-charged cloning vector can be accomplished by endonuclease digestion of the vector, followed by complementary annealing of synthetic oligonucleotides and site-specific cleavage of the heteroduplex by Vaccinia topoisomerase I. Digestion of a vector with any compatible endonuclease creates specific sticky ends. Custom oligonucleotides are annealed to these sticky ends, and possess sequences that, following topoisomerase I modification, form custom ends of the vector. The sequence and length of the single stranded overhang will vary based on the desires of the user.
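The way an annealed oligonucleotide determines the custom end can be sketched in code. The example below is a simplified model under stated assumptions: it takes the Vaccinia topoisomerase I recognition sequence to be 5′-(C/T)CCTT-3′ with cleavage immediately 3′ of that sequence (a detail from the general literature, not stated in this section), it assumes the short strand downstream of the cleavage site dissociates, and it ignores reaction conditions. The function names and example sequence are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Assumed recognition sequence; the lookahead lets overlapping sites be found.
_SITE = re.compile(r"(?=[CT]CCTT)")
_SITE_LENGTH = 5


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def topo_cleave(top_strand_5to3: str) -> tuple:
    """Model cleavage at the recognition site nearest the 3' terminus.

    Returns (charged_strand, overhang): the strand whose 3' terminus would
    carry the topoisomerase, and the 5' overhang (written 5' to 3') left on
    the complementary strand once the downstream fragment is released.
    An empty overhang corresponds to a blunt end.
    """
    top = top_strand_5to3.upper()
    starts = [m.start() for m in _SITE.finditer(top)]
    if not starts:
        raise ValueError("no recognition site found in the strand")
    cut = starts[-1] + _SITE_LENGTH
    released = top[cut:]
    # The complementary strand stays intact, so the bases that paired with
    # the released fragment become single stranded.
    return top[:cut], reverse_complement(released)


if __name__ == "__main__":
    print(topo_cleave("AAGCCCTTCACC"))
```

In this model, the bases placed downstream of the recognition site in the annealed oligonucleotide set both the length and the sequence of the resulting single stranded overhang.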

In a preferred use of the single strand sequence topoisomerase-charged ds nucleic acid vectors provided by the present invention, the DNA fragment to be inserted into the vector is an amplification reaction product such as a PCR product. Following PCR amplification with custom primers, the product can be directionally inserted into a topoisomerase I-charged cloning vector having a single strand sequence on one or both ends of the insertion site. The custom primers can be designed such that at least one primer of a given primer pair contains an additional sequence at its 5′ end. The added sequence is designed to be complementary to the sequence of the single stranded overhang in the vector. The complementarity between the 5′ single stranded overhang in the vector and the 5′ end of the PCR product mediates the directional insertion of the PCR product into the topoisomerase-charged vector. Specifically, since only one end of the vector and one end of the PCR product possess complementary single stranded sequence regions, the insertion of the product in this instance is directional, and topoisomerase can catalyze ligation of the PCR product to the vector.
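The primer design step described above can be sketched as follows. This is a minimal illustration, assuming sequences written 5′ to 3′ with only A, C, G and T; the function name and the gene-specific primer sequence are hypothetical, while the overhang 5′-GGTG-3′ is the example sequence given earlier in this document.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def add_directional_tag(vector_overhang_5to3: str, gene_specific_primer: str) -> str:
    """Prepend to a primer the sequence complementary to the vector's
    5' single stranded overhang, so that the 5' end of the resulting
    PCR product can hybridize to that overhang."""
    tag = reverse_complement(vector_overhang_5to3)
    return tag + gene_specific_primer.upper()


if __name__ == "__main__":
    # Hypothetical gene-specific primer beginning at a start codon.
    print(add_directional_tag("GGTG", "ATGGCTAGC"))
```

Only one primer of the pair is tagged in this sketch, so that only one end of the amplification product is complementary to the vector overhang.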

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A to 1E depict a number of ds nucleic acid molecules that can be used to practice various aspects of the invention. The circled and boxed areas shown in these depictions indicate regions which contain sufficient nucleotide sequence complementarity to engage in strand invasion with each other.

FIG. 1A shows two ds nucleic acid molecules (labeled “first” and “second” molecules) which each contain one terminus that is capable of engaging in strand invasion with a terminus of the other molecule (see boxes). When a topoisomerase is used to covalently link (e.g., ligate) strands of each molecule, the 3′ recessed strand of the terminus of the first molecule will generally be charged with topoisomerase. Further, this topoisomerase will generally catalyze the covalent linkage of the 3′ recessed strand of the terminus of the first molecule to the 5′ strand of the second molecule with which it engages in strand invasion (i.e., the 5′ terminus of the second nucleic acid molecule which is shown in the box).

FIG. 1B shows two ds nucleic acid molecules (labeled “first” and “second” molecules), each of which contains two termini that are capable of engaging in strand invasion with termini of the other molecule. Further, each of these two nucleic acid molecules has a blunt terminus and a terminus with a 5′ single stranded overhang (see circles and boxes). The nucleic acid molecules in this depiction can thus engage in two separate strand invasion events which, upon covalent linkage of nucleic acid strands at each terminus, result in the formation of a single, circular nucleic acid molecule. Covalent linkage of the termini can be performed as described, for example, for FIG. 1A, above.

FIG. 1C shows two ds nucleic acid molecules (labeled “first” and “second” molecules), each of which contains two termini that are capable of engaging in strand invasion with different termini of the other molecule. Further, one of these molecules has two blunt termini and the other molecule has 5′ single stranded overhangs on each terminus. The molecules in this depiction can thus engage in two separate strand invasion events which, upon covalent linkage of nucleic acid strands at each terminus, result in the formation of a circular nucleic acid molecule. Covalent linkage of the termini can be performed as described for FIG. 1A, above.

FIG. 1D shows three ds nucleic acid molecules (labeled “first”, “second” and “third” molecules). Two of these molecules (“first” and “third” molecules) contain 5′ single stranded overhangs which are capable of engaging in strand invasion with different blunt termini of the other molecule (“second” molecule). The molecules in this depiction can thus engage in two separate strand invasion events, which result in the generation of a linear nucleic acid molecule composed of all three molecules. Covalent linkage of the termini can be performed as described for FIG. 1A, above.

FIG. 1E shows nucleic acid molecules similar to those set out in FIG. 1D, above, except that one of the nucleic acid molecules (“second” molecule) has 5′ overhangs at both termini and the other two nucleic acid molecules (“first” and “third” molecules) each have two blunt termini.

FIG. 2 illustrates an aspect of the invention involving strand invasion of a first ds nucleic acid molecule with a substantially blunt end containing a topoisomerase at a 3′ terminus of a first strand containing a 5′ tail upstream of a topoisomerase recognition site; and a second ds nucleic acid molecule having a 3′ overhang complementary to the 5′ tail (see Cheng and Shuman, Mol. Cell. Biol. 20:8059-8068, 2000). The boxed areas shown in these depictions indicate regions which contain sufficient nucleotide sequence complementarity to engage in strand invasion with each other.

FIG. 3 provides the nucleotide sequence and the location of restriction endonuclease recognition sequences of the Multiple Cloning Site of pUni/V5-His version A (SEQ ID NO: 18), and a plasmid map of this 2.3 kb vector. The EcoRI cloning site is located at nucleotide 471, and the SacI cloning site is located at nucleotide 528.

FIG. 4 provides the nucleotide sequence and the location of restriction endonuclease recognition sequences of the Multiple Cloning Site of pCR2.1, and a plasmid map of this vector. The HindIII cloning site is located at nucleotide 234, SpeI is at nucleotide 258 and EcoRI is at nucleotide 283 and nucleotide 299. The vector is 3906 nucleotides. LacZ alpha fragment: bases 1-587; M13 reverse priming site: bases 205-221; Multiple cloning site: bases 234-355; T7 promoter/priming site: bases 362-381; M13 forward (−20) priming site: bases 389-404; M13 forward (−40) priming site: bases 408-424; f1 origin: bases 546-960; kanamycin resistance ORF: bases 1294-2088; ampicillin resistance ORF: bases 2106-2966; ColE1 origin: bases 3111-3784. The illustrated vector represents the pCR®2.1 vector with a PCR product inserted by TA Cloning®. Note that the inserted PCR product is flanked on each side by EcoRI sites. The arrow indicates the start of transcription for the T7 RNA polymerase.

FIG. 5 provides the nucleotide sequence of vector pUni/V5-His version A (SEQ ID NO:16).

FIG. 6 illustrates digestion of pUni/V5-His version A with EcoRI and SacI, and the resulting cohesive end sequences. The cohesive end on the left side of the figure, near the loxP element, results from EcoRI digestion. The cohesive end on the right side of the figure, near the V5 element, results from SacI digestion. Vector elements including a loxP, V5, and 6×His element, as well as a stop codon in frame with these elements, are indicated.

FIG. 7 illustrates the addition of adapter oligonucleotides to the digested vector in the presence of DNA ligase. The reaction yields the exhibited linearized, adapted vector. Adapter sequences are underlined for demarcation. The four adapter oligonucleotides have the following sequences:

TOPO D1: 5′-AATTGATCCCTTCACCGACATAGTACAG-3′ (SEQ ID NO: 5)
TOPO D2: 3′-CTAGGGAAGTGG-5′ (SEQ ID NO: 6)
TOPO D3: 3′-GACATGATACAGTTCCCGC-5′ (SEQ ID NO: 8)
TOPO D4: 5′-AAGGGCGAGCT-3′ (SEQ ID NO: 7)

The T4 ligation reaction yields the indicated linearized cloning vector; adapter sequences are underlined for demarcation.

FIG. 8 illustrates a topoisomerase cleavage reaction wherein, following topoisomerase cleavage of the scissile strand, a phosphate bond in the non-scissile strand keeps the leaving group associated with the vector. In the reaction shown, topoisomerase is added to the depicted ds nucleic acid molecule. Topoisomerase binds CCCTT and breaks the adjacent phosphodiester bond. Phosphodiester bonds between the adapted vector and the annealing oligo in the non-scissile strand prevent the dissociation of the leaving group upon cleavage. In the double stranded DNA model illustrated, X and x represent complementary nucleotide bases.

FIG. 9 illustrates a topoisomerase cleavage reaction wherein, following topoisomerase cleavage of the scissile strand, the lack of a phosphate bond in the non-scissile strand allows the leaving group to dissociate from the vector. In the reaction shown, topoisomerase is added to the depicted ds nucleic acid molecule. Topoisomerase binds CCCTT and breaks the adjacent phosphodiester bond. Lack of a phosphodiester bond between the adapted vector and the annealing oligo in the non-scissile strand allows the dissociation of the leaving group upon cleavage. In the double stranded DNA model illustrated, X and x represent complementary nucleotide bases.

FIG. 10 illustrates that addition of an annealing oligonucleotide to the linearized, adapted vector in the absence of DNA ligase yields the exhibited linearized, adapted and annealed vector. Note that the annealing oligonucleotide is not bound to the vector by a phosphate bond, thus allowing dissociation following topoisomerase mediated cleavage. Adapter oligonucleotides are demarcated by a single underline, while annealing oligonucleotides are demarcated by a double underline. There are no phosphodiester linkages between either of the TOPO D3s and their adjacent oligonucleotides TOPO D2 and TOPO D5. The annealing oligonucleotide has the following sequence and is complementary to the single stranded overhangs of both TOPO D1 and TOPO D4: TOPO D3 3′-CTGTATCATGTCAAC-5′ (SEQ ID NO:10).
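As a rough check on the adapter design of FIGS. 7 to 10, the following Python sketch pairs the two adapter oligonucleotides given for FIG. 7 (TOPO D1 and TOPO D2), cuts the scissile strand immediately 3′ of CCCTT, and reports the single stranded sequence left on the non-scissile strand. The function and variable names are illustrative assumptions; only the sequences and the CCCTT cleavage rule come from the description.

```python
# Sketch: predict the 5' overhang exposed after cleavage at CCCTT.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOPO_D1 = "AATTGATCCCTTCACCGACATAGTACAG"  # FIG. 7, written 5'->3' (scissile strand)
TOPO_D2 = "CTAGGGAAGTGG"                  # FIG. 7, written 3'->5' (non-scissile strand)
RECOGNITION_SITE = "CCCTT"


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the sequence."""
    return seq.translate(COMPLEMENT)


def overhang_after_cleavage(top_5to3: str, bottom_3to5: str, site: str = RECOGNITION_SITE):
    """Return (retained top strand, leaving group, bottom-strand overhang written 5'->3')."""
    start = top_5to3.find(complement(bottom_3to5))  # where the bottom strand pairs
    if start < 0:
        raise ValueError("bottom strand does not pair with the top strand")
    site_at = top_5to3.find(site)
    if site_at < 0:
        raise ValueError("no recognition site in the top strand")
    cut = site_at + len(site)                       # cleavage immediately 3' of the site
    retained, leaving = top_5to3[:cut], top_5to3[cut:]
    unpaired_3to5 = bottom_3to5[max(cut - start, 0):]
    return retained, leaving, unpaired_3to5[::-1]


retained, leaving, overhang = overhang_after_cleavage(TOPO_D1, TOPO_D2)
print(retained, leaving, overhang)  # AATTGATCCCTT CACCGACATAGTACAG GGTG
```

Under these assumptions the exposed overhang is 5′-GGTG-3′, which agrees with the overhang sequence named elsewhere in this description.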

FIG. 11 shows an example of a linearized topoisomerase-charged ds nucleic acid cloning vector of the invention. The single stranded overhang corresponds to a Kozak transcription sequence. The vector illustrated is a linearized TOPO flap cloning vector, modified pUni/V5-His version A.

FIG. 12 is the nucleotide sequence of vector pCR2.1 (SEQ ID NO:17).

FIGS. 13A and 13B show forms of the pCR2.1® vector.

FIG. 13A shows pCR2.1® following restriction digestion with EcoRI and HindIII (note the resulting sticky ends). Four adapter oligonucleotides were ligated to the linearized vector. TOPO binding sites on the oligonucleotides have the sequence CCCTT (underlined). Sticky end complementary bases are depicted in bold. The four adapter oligonucleotides had the following sequences:

TOPO H: 5′-AGCTCGCCCTTATTCCGATAGTG-3′ (SEQ ID NO: 11);
TOPO 16: 3′-GCGGGAATAAG (SEQ ID NO: 12);
TOPO 1: 5′-AATTCGCCCTTATTCCGATAGTG-3′ (SEQ ID NO: 13); and
TOPO 2: 3′-GCGGGAA-5′

TOPO H and TOPO 1 have 5′ ends that complement the HindIII and EcoRI sticky ends, respectively.

FIG. 13B shows the adapted version of pCR2.1® following incubation with the adapter oligos in the presence of T4 ligase.

FIG. 14 illustrates the addition of annealing oligonucleotides to the adapted pCR2.1® vector, followed by the binding of topoisomerase I and the topoisomerase mediated cleavage of the double stranded vector. The resulting vector is linear and charged with topoisomerase I on both ends. Also, one end of the vector has the custom 4-nucleotide single stranded sequence, while the other end is blunt. In the initial reaction illustrated, topoisomerase binds and cleaves the double stranded DNA at the 5′ end of the covalent binding site located near the ends of pCR2.1®, which contain the bound adapter and annealing oligonucleotides. This step is performed in the presence of T4 polynucleotide kinase. The annealing oligonucleotides have the following sequences:

TOPO 3: 3′-TAAGGGTATCACAAC-5′ (SEQ ID NO: 15); and
TOPO 17: 3′-GCTATCAC-5′

There are no phosphodiester bonds formed between TOPO 3 and TOPO 2, or between TOPO 17 and TOPO 16. The annealing oligonucleotides are double underlined for demarcation.

FIG. 15 illustrates a second example of a linearized topoisomerase-charged ds nucleic acid cloning vector of the present invention. In this example the single stranded overhang sequence is 3′-TAAG-5′. This vector is the linearized, TOPO charged, FLAP vector, modified pCR2.1®.

FIG. 16 illustrates PCR amplification of a gene of interest using primers designed for directional cloning. The resulting product possesses the necessary sequence for directional cloning using a vector of the invention. The CACC-containing primer depicted in the top illustration is homologous to the coding strand of the gene of interest, and has the “FLAP” sequence added to its 5′ end. Standard PCR amplification of the gene of interest in the presence of the appropriate primers, including the CACC-containing primer, gives the product depicted in the bottom illustration. The product is a double stranded gene of interest amplicon with the flap sequence at its 5′ end.

FIG. 17 illustrates how double stranded nucleic acid vectors of the present invention, including a TOPO FLAP cloning vector, which possesses a single stranded overhang, can facilitate insertion of amplified DNA in the proper orientation. Once the product is correctly inserted, topoisomerase will ligate the product to the vector.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides compositions and methods of using strand invasion to directionally or non-directionally link two or more double stranded (ds) nucleic acid molecules. For example, the present invention provides a ds nucleic acid molecule having a first end and a second end, wherein the first end contains a first 5′ overhang and a first topoisomerase covalently bound at the 3′ terminus, and the second end contains a second topoisomerase covalently bound at the 3′ terminus and contains a second 5′ overhang, a blunt end, a 3′ uridine overhang, or a 3′ thymidine overhang, wherein the first 5′ overhang is different from the second 5′ overhang. The first topoisomerase and second topoisomerase can be the same or different. The first 5′ overhang can have any nucleotide sequence, including, for example, the nucleotide sequence 5′-GGTG-3′.

Aspects of the present invention modify topoisomerase-mediated cloning so as to allow DNA fragments, including PCR-generated ORFs, to be directionally inserted into cloning vectors, while maintaining the advantages provided by ligation using topoisomerase. The system greatly reduces the amount of work involved in screening to identify clones containing inserts in the desired orientation by enabling directional cloning efficiencies that are routinely in excess of 90%. The present invention streamlines high throughput gene expression operations, reduces costs associated with the screening process, and provides additional advantages.

A topoisomerase-charged ds nucleic acid molecule of the invention generally has a single stranded overhang and a first topoisomerase covalently bound at or near a terminus of a first end. In addition, a topoisomerase-charged ds nucleic acid molecule of the invention can include a second topoisomerase covalently bound at or near a terminus of the second end. The single stranded overhang can be a 5′ overhang, and each topoisomerase can be bound at or near one or both 3′ termini. Where a topoisomerase is bound to one, or preferably both, 3′ termini, the second end of the topoisomerase-charged ds nucleic acid molecule of the present invention typically is a blunt end, a 3′ thymidine overhang, or a second 5′ overhang that is different from the first 5′ overhang.

As used herein, reference to a nucleic acid molecule having “a first end” and “a second end” means that the nucleic acid molecule is linear. The term “single stranded overhang” or “overhang” is used herein to refer to a strand of a ds nucleic acid molecule that extends beyond the terminus of the complementary strand of the ds nucleic acid molecule. The term “5′ overhang” or “5′ overhanging sequence” is used herein to refer to a strand of a ds nucleic acid molecule that extends in a 5′ direction beyond the 3′ terminus of the complementary strand of the ds nucleic acid molecule. The term “3′ overhang” or “3′ overhanging sequence” is used herein to refer to a strand of a ds nucleic acid molecule that extends in a 3′ direction beyond the 5′ terminus of the complementary strand of the ds nucleic acid molecule. Conveniently, a 5′ overhang can be produced as a result of site specific cleavage of a ds nucleic acid molecule by a type IB topoisomerase (see Examples 1 and 2). Similarly, a 3′ overhang can be produced upon cleavage of a ds nucleic acid molecule by a type IA or type II topoisomerase.

The 3′ overhang and 5′ overhang can have any nucleotide sequence and can be any length (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, etc. nucleotides), but will generally be at least two nucleotides. The overhanging sequences can be selected such that they allow ligation of a predetermined end of one ds nucleic acid molecule to a predetermined end of a second nucleic acid molecule according to a method of the invention. Where the double stranded nucleic acid molecules are directionally linked, the 3′ or 5′ overhangs are generally not palindromic because ds nucleic acid molecules having palindromic overhangs can associate with each other, thus reducing the yield of a directionally linked recombinant nucleic acid molecule comprising two or more ds nucleic acid molecules in a predetermined orientation. The overhang can comprise, for example, a nucleotide sequence of a transcriptional or translational regulatory element such as a promoter, Kozak sequence, start codon, or the like, or a complement of such a nucleotide sequence.
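Whether a candidate overhang is palindromic can be tested by comparing it with its own reverse complement. The following Python sketch is illustrative only; the function names are assumptions. AATT and AGCT are the overhang sequences left by EcoRI and HindIII, respectively, and GGTG is the overhang named in this description.

```python
# Sketch: flag palindromic (self-complementary) overhangs.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(overhang: str) -> bool:
    """True when the overhang equals its own reverse complement and so can pair with itself."""
    return overhang.upper() == reverse_complement(overhang)


for candidate in ("AATT", "AGCT", "GGTG"):
    print(candidate, is_palindromic(candidate))
# AATT True
# AGCT True
# GGTG False
```

An overhang of odd length can never be palindromic in this sense, because its central base would have to be its own complement.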

A 3′ overhang or 5′ overhang can include virtually any nucleotide or nucleotide analog or modified nucleotide that can hybridize with a complementary nucleotide residue, provided that at least a portion of the nucleotide sequence of the overhang can hybridize with the complementary sequence. Thus, the nucleosides in an overhang can include naturally occurring nucleosides such as purines (guanosine (G) or adenosine (A)), or pyrimidines (thymidine (T), uridine (U) or cytidine (C)). Additionally, the overhang can include substitutes for the nucleosides, for example, a nucleoside such as inosine, or a modified form of a nucleoside such as methyl guanosine, or a 5-halogenated pyrimidine nucleoside (e.g., 5-bromodeoxyuridine or 5-methyldeoxycytidine). If desired, the overhang can have a relatively high GC content, for example, the overhang can have a greater than 50% GC content, such as 66% GC or 75% GC or 80% GC or 100% GC content. In one embodiment, the overhang has the sequence 5′-GGTG-3′.
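The GC content figures above follow from simple counting, as the short Python sketch below shows. The function name is an assumption made for illustration; the two sequences are overhangs given in this description (5′-GGTG-3′, and 3′-TAAG-5′ from FIG. 15).

```python
# Sketch: percentage of G and C residues in an overhang.
def gc_percent(overhang: str) -> float:
    """Return the GC content of a non-empty DNA sequence as a percentage."""
    seq = overhang.upper()
    if not seq:
        raise ValueError("overhang must contain at least one nucleotide")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


print(gc_percent("GGTG"))  # 75.0
print(gc_percent("TAAG"))  # 25.0
```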

A 5′ or 3′ overhang of a first nucleic acid molecule, for example, can include one or two or a few nucleotide residues, for example, at the free terminus of the overhang, for which a complementary nucleotide residue is not present in the complementary sequence at or near the substantially blunt end of the second (or other) ds nucleic acid molecule to which it is being linked. Nevertheless, the overhang at the end of the first nucleic acid molecule can selectively hybridize to the complementary sequence of the second nucleic acid molecule due to the other nucleotide residues in the overhang. For example, where a 5′ overhang consists of six nucleotides, the 5′-most one or two nucleotides need not be complementary to the corresponding nucleotides in the complementary nucleotide sequence in the second nucleic acid molecule, but selective hybridization nevertheless can occur due to the complementarity of the remaining four nucleotide residues. The number or specific positions of non-complementary nucleotide residues that can be in an overhang (or in the “complementary” sequence in the second nucleic acid molecule) without substantially reducing or inhibiting hybridization specificity can be determined using routine hybridization methods.
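The six-nucleotide example can be made concrete with a small Python sketch that lists which overhang positions fail to pair and checks whether all of them fall within the 5′-most residues. Both sequences and the tolerance of two residues are hypothetical values chosen for illustration; as stated above, the tolerance that applies in practice is determined by routine hybridization methods, not by this check.

```python
# Sketch: locate non-complementary positions in an overhang (position 0 is the 5'-most base).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_positions(overhang_5to3: str, target_5to3: str) -> list:
    """Overhang positions whose antiparallel partner in the target is not complementary."""
    if len(overhang_5to3) != len(target_5to3):
        raise ValueError("sequences must have equal length")
    last = len(target_5to3) - 1
    return [i for i, base in enumerate(overhang_5to3)
            if COMPLEMENT[base] != target_5to3[last - i]]


def mismatches_confined_to_5prime(overhang_5to3: str, target_5to3: str, free_5prime: int = 2) -> bool:
    """True when every mismatch lies within the first `free_5prime` overhang positions."""
    return all(i < free_5prime for i in mismatch_positions(overhang_5to3, target_5to3))


OVERHANG = "TTGGTG"  # hypothetical six-nucleotide 5' overhang
TARGET = "CACCGG"    # hypothetical target strand; pairs with only the last four overhang bases

print(mismatch_positions(OVERHANG, TARGET))             # [0, 1]
print(mismatches_confined_to_5prime(OVERHANG, TARGET))  # True
```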

The nucleotide residues of the overhang can include locked nucleic acid (“LNA”) analogues (Proligo; Boulder Colo.). LNA monomers are bicyclic compounds that are structurally similar to ribonucleosides. The term “Locked Nucleic Acid” was coined to emphasize that the furanose ring conformation is restricted in an LNA by a methylene linker that connects the 2′-O position to the 4′-C position. As used herein, all nucleic acid molecules containing one or more LNA modifications are referred to as LNA molecules. LNA oligomers obey Watson-Crick base pairing rules and hybridize to complementary oligonucleotides. LNA can provide vastly improved hybridization performance and increased thermal stability when compared to DNA and other nucleic acid derivatives in a number of situations (Koshkin et al., Tetrahedron 54:3607-30, 1998; Koshkin et al., J. Am. Chem. Soc. 120:13252-53, 1998; Wahlestedt et al., Proc. Natl. Acad. Sci., USA 97:5633-38, 2000).

It should be recognized that reference to a first end or a second end of a ds nucleic acid molecule is not intended to imply any particular orientation of the nucleic acid molecule, and is not intended to imply a relative importance of the ends with respect to each other. Where a nucleic acid molecule having a first end and second end is a double stranded nucleic acid molecule, each end contains a 5′ terminus and a 3′ terminus. Thus, reference is made herein, for example, to a nucleic acid molecule containing a topoisomerase recognition site at a 3′ terminus and a hydroxyl group at the 5′ terminus of the same end, which can be the first end or the second end.

Topoisomerase, when bound to a nucleic acid molecule, will generally be bound “at or near” a terminus of a ds nucleic acid molecule. The term “at or near,” when used with respect to a topoisomerase, means that the topoisomerase is covalently bound to one strand of a ds nucleic acid molecule such that it can ligate the terminus of the strand to which it is bound to a second nucleic acid molecule containing a free 5′ terminal hydroxyl group. Generally, the topoisomerase is “at or near” an end by virtue of being covalently bound to one terminus of the end. For example, where the topoisomerase is a type IB topoisomerase such as a Vaccinia topoisomerase, the topoisomerase is bound at the 3′ terminus of an end of a ds nucleic acid molecule. However, an end having a topoisomerase covalently bound to a terminus of the end also can contain a single stranded overhang sequence in the complementary strand, thus extending beyond the terminus to which the topoisomerase is bound. Such a topoisomerase is an example of a topoisomerase near an end of the ds nucleic acid molecule.

As used herein, the term “isolated,” when used in reference to a molecule, means that the molecule is in a form other than that in which it exists in nature. In general, an isolated nucleic acid molecule, for example, can be any nucleic acid molecule that is not part of a genome in a cell, or is separated physically from a cell that normally contains the nucleic acid molecule. It should be recognized that various compositions of the invention comprise a mixture of isolated ds nucleic acid molecules. As such, it will be understood that the term “isolated” is used only in respect to the isolation of the molecule from its natural state, and does not indicate that the molecule is the only constituent.

Topoisomerases are a class of enzymes that modify the topological state of DNA via the breakage and rejoining of DNA strands (Shuman et al., U.S. Pat. No. 5,766,891, incorporated herein by reference). Topoisomerases are categorized as type I, including type IA and type IB topoisomerases, which cleave a single strand of a double stranded nucleic acid molecule, and type II topoisomerases (gyrases), which cleave both strands of a nucleic acid molecule. As disclosed herein, type I and type II topoisomerases, as well as catalytic domains and mutant forms thereof, are useful for generating directionally linked recombinant nucleic acid molecules according to a method of the invention. Type II topoisomerases have not generally been used for generating recombinant nucleic acid molecules or cloning procedures, whereas type IB topoisomerases are used in a variety of procedures.

Type IA and IB topoisomerases cleave one strand of a ds nucleic acid molecule. Cleavage of a ds nucleic acid molecule by type IA topoisomerases generates a 5′ phosphate and a 3′ hydroxyl at the cleavage site, with the type IA topoisomerase covalently binding to the 5′ terminus of a cleaved strand. In comparison, cleavage of a ds nucleic acid molecule by type IB topoisomerases generates a 3′ phosphate and a 5′ hydroxyl at the cleavage site, with the type IB topoisomerase covalently binding to the 3′ terminus of a cleaved strand. Type IA topoisomerases include, for example, E. coli topoisomerase I and topoisomerase III, eukaryotic topoisomerase III, and archaeal reverse gyrase (see Berger, Biochim. Biophys. Acta 1400:3-18, 1998, which is incorporated herein by reference).

Type IB topoisomerases include the nuclear type I topoisomerases present in all eukaryotic cells and those encoded by Vaccinia and other cellular poxviruses (see Cheng et al., Cell 92:841-850, 1998, which is incorporated herein by reference). The eukaryotic type IB topoisomerases are exemplified by those expressed in yeast, Drosophila and mammalian cells, including human cells (see Caron and Wang, Adv. Pharmacol. 29B:271-297, 1994; Gupta et al., Biochim. Biophys. Acta 1262:1-14, 1995, each of which is incorporated herein by reference; see, also, Berger, supra, 1998). Viral type IB topoisomerases are exemplified by those produced by the vertebrate poxviruses (Vaccinia, Shope fibroma virus, ORF virus, fowlpox virus, and molluscum contagiosum virus), and the insect poxvirus (Amsacta moorei entomopoxvirus) (see Shuman, Biochim. Biophys. Acta 1400:321-337, 1998; Petersen et al., Virology 230:197-206, 1997; Shuman and Prescott, Proc. Natl. Acad. Sci., USA 84:7478-7482, 1987; Shuman, J. Biol. Chem. 269:32678-32684, 1994; U.S. Pat. No. 5,766,891; PCT/US95/16099; PCT/US98/12372, each of which is incorporated herein by reference; see, also, Cheng et al., supra, 1998).

Type II topoisomerases include, for example, bacterial gyrase, bacterial DNA topoisomerase IV, eukaryotic DNA topoisomerase II, and T-even phage encoded DNA topoisomerases (Roca and Wang, Cell 71:833-840, 1992; Wang, J. Biol. Chem. 266:6659-6662, 1991, each of which is incorporated herein by reference; Berger, supra, 1998). Like the type IB topoisomerases, the type II topoisomerases have both cleaving and ligating activities. In addition, as with a type IB topoisomerase, substrate ds nucleic acid molecules can be prepared such that the type II topoisomerase can form a covalent linkage to one strand at a cleavage site. For example, calf thymus type II topoisomerase can cleave a substrate ds nucleic acid molecule containing a 5′ recessed topoisomerase recognition site positioned three nucleotides from the 5′ end, resulting in dissociation of the three-nucleotide nucleic acid molecule 5′ to the cleavage site and covalent binding of the topoisomerase to the 5′ terminus of the ds nucleic acid molecule (Andersen et al., supra, 1991). Furthermore, upon contacting such a type II topoisomerase-charged ds nucleic acid molecule with a second nucleic acid molecule containing a 3′ hydroxyl group, the type II topoisomerase can ligate the sequences together, and then is released from the recombinant nucleic acid molecule. As such, type II topoisomerases also are useful for performing methods of the invention.
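The distinctions drawn above among type IA, type IB and type II topoisomerases can be collected into a small lookup table, sketched here in Python. The class and field names are assumptions made for illustration. The type IA acceptor entry is inferred from the statement that type IA cleavage generates a 3′ hydroxyl; the other entries restate the text directly.

```python
# Sketch: summary of cleavage and joining polarity by topoisomerase class.
from dataclasses import dataclass


@dataclass(frozen=True)
class TopoClass:
    strands_cleaved: int    # strands of the duplex that are cut
    bound_terminus: str     # terminus to which the enzyme becomes covalently bound
    acceptor_hydroxyl: str  # free hydroxyl terminus needed on the incoming molecule


TOPO_CLASSES = {
    "IA": TopoClass(strands_cleaved=1, bound_terminus="5'", acceptor_hydroxyl="3'"),  # acceptor inferred
    "IB": TopoClass(strands_cleaved=1, bound_terminus="3'", acceptor_hydroxyl="5'"),
    "II": TopoClass(strands_cleaved=2, bound_terminus="5'", acceptor_hydroxyl="3'"),
}


def can_join(topo_type: str, incoming_hydroxyl_terminus: str) -> bool:
    """True when the incoming molecule offers the hydroxyl terminus this class ligates to."""
    return TOPO_CLASSES[topo_type].acceptor_hydroxyl == incoming_hydroxyl_terminus


print(can_join("IB", "5'"))  # True
print(can_join("II", "5'"))  # False
```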

Structural analysis of topoisomerases indicates that the members of each particular topoisomerase family, including type IA, type IB and type II topoisomerases, share common structural features with other members of the family (Berger, supra, 1998). In addition, sequence analysis of various type IB topoisomerases indicates that the structures are highly conserved, particularly in the catalytic domain (Shuman, supra, 1998; Cheng et al., supra, 1998; Petersen et al., supra, 1997). For example, a domain comprising amino acids 81 to 314 of the 314 amino acid Vaccinia topoisomerase shares substantial homology with other type IB topoisomerases, and the isolated domain has essentially the same activity as the full length topoisomerase, although the isolated domain has a slower turnover rate and lower binding affinity to the recognition site (see Shuman, supra, 1998; Cheng et al., supra, 1998). In addition, a mutant Vaccinia topoisomerase, which is mutated in the amino terminal domain (at amino acid residues 70 and 72), displays properties identical to those of the full length topoisomerase (Cheng et al., supra, 1998). In fact, mutation analysis of Vaccinia type IB topoisomerase reveals a large number of amino acid residues that can be mutated without affecting the activity of the topoisomerase, and has identified several amino acids that are required for activity (Shuman, supra, 1998). In view of the high homology shared among the Vaccinia topoisomerase catalytic domain and the other type IB topoisomerases, and the detailed mutation analysis of Vaccinia topoisomerase, it will be recognized that isolated catalytic domains of the type IB topoisomerases and type IB topoisomerases having various amino acid mutations can be used in the methods of the invention and thus are considered to be topoisomerases for purposes of the present invention.

The various topoisomerases exhibit a range of sequence specificity. For example, type II topoisomerases can bind to a variety of sequences, but cleave at a highly specific recognition site (see Andersen et al., J. Biol. Chem. 266:9203-9210, 1991, which is incorporated herein by reference). In comparison, the type IB topoisomerases include site specific topoisomerases, which bind to and cleave a specific nucleotide sequence (“topoisomerase recognition site”). Upon cleavage of a ds nucleic acid molecule by a topoisomerase, for example, a type IB topoisomerase, the energy of the phosphodiester bond is conserved via the formation of a phosphotyrosyl linkage between a specific tyrosine residue in the topoisomerase and the 3′ nucleotide of the topoisomerase recognition site. Where the topoisomerase cleavage site is near the 3′ terminus of the nucleic acid molecule, the downstream sequence (3′ to the cleavage site) can dissociate, leaving a nucleic acid molecule having the topoisomerase covalently bound to the newly generated 3′ end (see FIG. 9).

The covalently bound topoisomerase also can catalyze the reverse reaction, for example, covalent linkage of the 3′ nucleotide of the recognition sequence, to which a type IB topoisomerase is linked through the phosphotyrosyl bond, and a nucleic acid molecule containing a free 5′ hydroxyl group. Accordingly, methods have been developed for using a type IB topoisomerase to produce recombinant nucleic acid molecules, and cloning vectors containing a bound type IB topoisomerase have been developed and are commercially available (Invitrogen Corp., Carlsbad Calif.). Such cloning vectors, when linearized, contain a covalently bound type IB topoisomerase at each 3′ end (“topoisomerase-charged”). Nucleic acid molecules such as those comprising a cDNA library, or restriction fragments, or sheared genomic DNA sequences that are to be cloned into such a vector are treated, for example, with a phosphatase to produce 5′ hydroxyl termini, then are added to the linearized vector under conditions that allow the topoisomerase to ligate the nucleic acid molecules at the 5′ terminus containing the hydroxyl group and the 3′ terminus containing the covalently bound topoisomerase. A nucleic acid molecule such as a PCR amplification product, which is produced containing 5′ hydroxyl ends, can be cloned into a topoisomerase-charged vector in a rapid joining reaction (approximately 5 min at room temperature). The rapid joining and broad temperature range inherent to the topoisomerase joining reaction make the use of topoisomerase-charged vectors ideal for high throughput applications, which generally are performed using automated systems.

Vaccinia virus encodes a 314 amino acid type I topoisomerase enzyme capable of site-specific single-strand nicking of double stranded DNA, as well as 5′ hydroxyl driven religation. Site-specific type I topoisomerases include, but are not limited to, viral topoisomerases such as pox virus topoisomerase. Examples of pox virus topoisomerases include those of Shope fibroma virus and ORF virus. Other site-specific topoisomerases are well known to those skilled in the art and can be used to practice this invention.

Vaccinia topoisomerase binds to duplex DNA and cleaves the phosphodiester backbone of one strand while exhibiting a high level of sequence specificity. Cleavage occurs at a consensus pentapyrimidine element 5′-(C/T)CCTT↓, or related sequences in the scissile strand. In one embodiment the scissile bond is situated in the range of 2 to 12 bp from the 3′ end of the duplex DNA. In another embodiment cleavable complex formation by Vaccinia topoisomerase requires six duplex nucleotides upstream and two nucleotides downstream of the cleavage site. Examples of Vaccinia topoisomerase cleavable sequences include, but are not limited to, +6/−6 duplex GCCCTTATTCCC (SEQ ID NO:1), +8/−4 duplex TCGCCCTTATTC (SEQ ID NO:2), +10/−2 duplex TGTCGCCCTTAT (SEQ ID NO:3), and +11/−1 duplex GTGTCGCCCTTA (SEQ ID NO:4).
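The +N/−M labels of the four example duplexes can be reproduced by locating the pentapyrimidine element and counting nucleotides on either side of the scissile bond. The Python sketch below is illustrative only: the function name is an assumption, and it matches only the consensus element (C/T)CCTT, not the related sequences mentioned above.

```python
# Sketch: locate 5'-(C/T)CCTT and count nucleotides upstream/downstream of the cut.
import re

PENTAPYRIMIDINE = re.compile(r"[CT]CCTT")


def cleavage_label(scissile_strand_5to3: str):
    """Return (upstream, downstream) nucleotide counts around the first site, or None."""
    match = PENTAPYRIMIDINE.search(scissile_strand_5to3.upper())
    if match is None:
        return None
    cut = match.end()  # cleavage occurs immediately 3' of the element
    return cut, len(scissile_strand_5to3) - cut


EXAMPLES = {
    "GCCCTTATTCCC": (6, 6),   # SEQ ID NO:1, +6/-6
    "TCGCCCTTATTC": (8, 4),   # SEQ ID NO:2, +8/-4
    "TGTCGCCCTTAT": (10, 2),  # SEQ ID NO:3, +10/-2
    "GTGTCGCCCTTA": (11, 1),  # SEQ ID NO:4, +11/-1
}

for sequence, expected in EXAMPLES.items():
    print(sequence, cleavage_label(sequence), expected)
```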

Examples of other site-specific type I topoisomerases are well known in the art. These enzymes are encoded by many organisms including, but not limited to, Saccharomyces cerevisiae, Schizosaccharomyces pombe and Tetrahymena. However, the topoisomerase I enzymes of these species have less specificity for a consensus sequence than does the Vaccinia topoisomerase (Lynn et al., Proc. Natl. Acad. Sci. USA 86: 3559-3563, 1989; Eng et al., J. Biol. Chem. 264: 13373-13376, 1989; Busk et al., Nature 327: 638-640, 1987).

The compositions and methods of the invention are exemplified generally herein with reference to the use of a type IB topoisomerase such as the Vaccinia topoisomerase. However, it will be recognized that the methods also can be performed using other topoisomerases merely by adjusting the components accordingly. For example, as described in greater detail below, methods are disclosed for incorporating a type IB topoisomerase recognition site at one or both 3′ termini of a ds nucleic acid molecule. Accordingly, in view of the present disclosure, the artisan will recognize that a topoisomerase recognition site for a type IA or type II topoisomerase similarly can be incorporated into a ds nucleic acid molecule.

A topoisomerase-charged ds nucleic acid molecule that contains a 5′ overhang on a first end generally contains a topoisomerase covalently bound to the 3′ terminus of the first end. A ds nucleic acid molecule containing a 5′ overhang and first topoisomerase at a first end also can contain a second topoisomerase covalently bound to the second end. The topoisomerase covalently bound to the first end can be the same as or different from the topoisomerase covalently bound to the second end. Thus, a Vaccinia topoisomerase can be covalently bound to a first end and another poxvirus or eukaryotic nuclear type IB topoisomerase can be bound to the second end. Generally, where the topoisomerases at each end are different, they are members of the same general family, for example, type IA or type IB or type II topoisomerase.

In one embodiment, a topoisomerase-charged double stranded nucleic acid molecule of the invention is a vector, which can be a cloning vector or an expression vector. The vector can include elements such as a bacterial origin of replication, a eukaryotic origin of replication, antibiotic resistance genes, and the like, and can further include topoisomerase recognition sites or topoisomerase-charged ends or a combination thereof. Such vectors of the invention can conveniently be packaged into kits as disclosed herein. A vector of the invention can be a plasmid vector, a cosmid vector, an artificial chromosome (e.g., a bacterial artificial chromosome, a yeast artificial chromosome, a mammalian artificial chromosome, etc.) or a viral vector such as a bacteriophage, baculovirus, retrovirus, lentivirus, adenovirus, Vaccinia virus, Semliki Forest virus and adeno-associated virus vector, all of which are well known and can be purchased from commercial sources (Promega, Madison Wis.; Stratagene, La Jolla Calif.; GIBCO/BRL, Gaithersburg Md.). Viral expression vectors can be particularly useful where a method of the invention is practiced for the purpose of generating a directionally linked recombinant nucleic acid molecule that is to be introduced into a cell, particularly a cell in a subject. Viral vectors provide the advantage that they can infect host cells with relatively high efficiency and can infect specific cell types or can be modified to infect particular cells in a host.

Viral vectors have been developed for use in particular host systems and include, for example, baculovirus vectors, which infect insect cells; retroviral vectors, other lentivirus vectors such as those based on the human immunodeficiency virus (HIV), adenovirus vectors, adeno-associated virus (AAV) vectors, herpesvirus vectors, Vaccinia virus vectors, and the like, which infect mammalian cells (see Miller and Rosman, BioTechniques 7:980-990, 1992; Anderson et al., Nature 392:25-30 Suppl., 1998; Verma and Somia, Nature 389:239-242, 1997; Wilson, New Engl. J. Med. 334:1185-1187, 1996, each of which is incorporated herein by reference). For example, a viral vector based on an HIV can be used to infect T cells, a viral vector based on an adenovirus can be used, for example, to infect respiratory epithelial cells, and a viral vector based on a herpesvirus can be used to infect neuronal cells. Other vectors, such as AAV vectors, can have greater host cell range and, therefore, can be used to infect various cell types, although viral or non-viral vectors also can be modified with specific receptors or ligands to alter target specificity through receptor mediated events.

A linearized vector of the invention, which is topoisomerase-charged or contains topoisomerase recognition sites, can be generated using methods as disclosed herein or otherwise known in the art. For example, a circular vector can be linearized, and modified by ligating or hybridizing one or more oligonucleotides, to generate a topoisomerase recognition site, or a cleavage product thereof, and a target 5′ sequence or 5′ overhang, at one or both ends (see Examples 1 and 2). The vector also can contain, for example, expression control elements required for replication in a prokaryotic host cell, a eukaryotic host cell, or both, and can contain a nucleotide sequence encoding a polypeptide that confers antibiotic resistance or the like, or such elements can be introduced into the vector using the methods of the invention. Furthermore, the vector can contain one, two, or more site specific integration recognition sites such as an att site or lox site. The incorporation, for example, of attB or attP sequences into an isolated nucleic acid molecule of the present invention allows for the convenient manipulation of the nucleic acid molecule using the GATEWAY™ Cloning System (Invitrogen Corp., La Jolla Calif.).

The invention provides a modified cloning vector having an overhanging single stranded piece of DNA charged with topoisomerase. The modified vector allows the directional insertion of linear ds nucleic acid molecules, for example PCR amplified, or otherwise suitable ORFs, for subsequent expression, and takes advantage of topoisomerase cloning efficiency. As used herein, the term “donor” signifies molecules such as a duplex DNA which contains a 5′-CCCTT cleavage site near the 3′ end, and the term “acceptor” signifies a duplex DNA which contains a 5′-OH terminus. Once covalently activated by topoisomerase, the donor will be transferred to those acceptors to which it has single strand sequence complementarity.

According to the present invention, in particular embodiments topoisomerase-modified vectors are further adapted to contain at least one 5′ single-stranded overhang sequence to facilitate the directional insertion of DNA segments. A nucleic acid molecule to be cloned into such a vector can be a PCR product constituting an ORF, which can be expressed from the resultant recombinant vector. The primers used for amplifying the ORF are designed such that at least one primer of the primer pair contains an additional sequence at its 5′ end. This sequence is designed to be complementary to the sequence of the 5′ single-stranded overhang present in the topoisomerase-modified vector of the present invention.
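By way of illustration only, the primer design described above can be sketched in a few lines of Python. The sketch assumes that the vector overhang is written 5′ to 3′ and that the additional primer sequence is simply its reverse complement; the overhang GGTG, the gene-specific sequence, and the function names are hypothetical and are not taken from the Examples.

```python
# Illustrative sketch: add a 5' tail to a forward primer so that the PCR
# product carries a sequence complementary to the vector's 5' overhang.
# The overhang and primer sequences below are hypothetical examples.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tailed_forward_primer(vector_overhang: str, gene_specific: str) -> str:
    """Prepend the complement of the vector overhang to a gene-specific primer.

    Both arguments are written 5' to 3'. The returned primer begins with the
    sequence that can base-pair with the single-stranded vector overhang.
    """
    return reverse_complement(vector_overhang) + gene_specific.upper()


if __name__ == "__main__":
    primer = tailed_forward_primer("GGTG", "ATGGCTAGCAAAGGAGAA")
    print(primer)  # CACCATGGCTAGCAAAGGAGAA
```

In this sketch only the forward primer is tailed, so only one end of the amplification product carries the sequence complementary to the overhang.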

The present invention generally provides methods for generating a directionally or non-directionally linked recombinant nucleic acid molecule, by using a strand invasion event and a ligation event to link, in a directional or non-directional manner, a first nucleic acid molecule and at least a second nucleic acid molecule. As used herein, the term “strand invasion” refers to the displacement of one strand of a first double stranded nucleic acid molecule by a single stranded portion of a second nucleic acid molecule, wherein the single strand has a nucleotide sequence that is substantially identical to the displaced strand and can selectively hybridize to the strand complementary to the displaced strand.

A method for generating a directionally or non-directionally linked recombinant nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid molecule having a first overhang on a first strand (e.g., a 3′ or 5′ strand) at a first end; and a second ds nucleic acid molecule having a first substantially blunt end and a second end, wherein a nucleotide sequence that is complementary to the first overhang of the first end of the first nucleic acid molecule is present at or near the first substantially blunt end. The method is performed under conditions such that the first overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the second ds nucleic acid molecule, and the first end of the first ds nucleic acid molecule and the first end of the second ds nucleic acid molecule can be linked. The first overhang can be a 3′ overhang or a 5′ overhang. The invention further provides precursor nucleic acid molecules which can be used to prepare molecules suitable for use in the method described above. The invention also provides nucleic acid molecules prepared by the above method.

FIG. 1 illustrates examples of ways in which the methods of the invention can be used to generate a covalently linked recombinant nucleic acid molecule. The boxes and circles in FIG. 1 are used to depict regions of sequence complementarity such that a strand displacement event can occur. The other ends of the ds nucleic acid molecules, which do not necessarily (but can) involve a strand displacement event, can have any structure; for example, they can be substantially blunt or can include a 3′ or 5′ overhang. Other combinations of blunt ends and/or overhangs on the first, second, third, etc. ds nucleic acid molecules can be linked according to the methods of the invention and will be evident to those in the art based, in part, on the examples provided in FIG. 1.

As shown in FIG. 1A, a method for generating a directionally or non-directionally linked recombinant nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid molecule, which has a first overhang on a first strand at a first end; a second ds nucleic acid molecule, which has a first substantially blunt end and a second end, wherein the first substantially blunt end has a nucleotide sequence that is complementary to the first overhang of the first end of the first nucleic acid molecule; and a reagent for ligating the nucleic acid molecules (e.g., a reagent comprising a topoisomerase, a ligase, or a recombinase). The method is performed under conditions such that the first overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the second ds nucleic acid molecule. Furthermore, the method is performed in the presence of a reagent that can ligate a 5′ strand of one nucleic acid molecule to a 3′ strand of a second nucleic acid molecule, and under conditions such that the 3′ terminus of the first end of the first ds nucleic acid molecule and the 5′ terminus of the first end of the second ds nucleic acid molecule are linked. In various modifications of the method described above, as well as in other methods described above, the first overhang can be a 3′ overhang or a 5′ overhang.

As shown in FIG. 1B, a method for generating a linked, for example directionally linked, recombinant nucleic acid molecule can be performed by contacting a first ds nucleic acid molecule with a first overhang on a first strand at a first end and a second substantially blunt end; and a second ds nucleic acid molecule, which has a first substantially blunt end and a second end which has a second overhang, wherein the first substantially blunt end of the second ds nucleic acid molecule has a nucleotide sequence that is complementary to the first overhang of the first end of the first nucleic acid molecule, and wherein the second substantially blunt end of the first nucleic acid molecule has a nucleotide sequence that is complementary to the second overhang of the second end of the second ds nucleic acid molecule. The method is performed under conditions such that the first overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the second ds nucleic acid molecule, and wherein the second overhang can selectively hybridize to the complementary nucleotide sequence of the second substantially blunt end of the first ds nucleic acid molecule. Furthermore, the method may be performed in the presence of a reagent which can catalyze the ligation of a 3′ strand of one nucleic acid molecule to a 5′ strand of another nucleic acid molecule, and under conditions such that the 3′ terminus of the first end of the first ds nucleic acid molecule is linked to the 5′ terminus of the first end of the second ds nucleic acid molecule, and the 3′ terminus of the second substantially blunt end of the first ds nucleic acid molecule is linked to the 5′ terminus of the second end of the second ds nucleic acid molecule. In various modifications of the method described above, one or both of the first and second overhangs can be 3′ overhangs or 5′ overhangs. The ds nucleic acid molecules can thus engage in two separate strand invasion events which, upon covalent linkage of nucleic acid strands at each terminus, result in the formation of a circular recombinant nucleic acid molecule.

As shown in FIG. 1C, a method for generating a linked, for example directionally linked, recombinant nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid molecule with a first overhang on a first strand at a first end and a second end having a second overhang; and a second ds nucleic acid molecule, which has a first substantially blunt end and a second substantially blunt end, wherein the first substantially blunt end of the second ds nucleic acid molecule has a nucleotide sequence that is complementary to the first overhang of the first end of the first nucleic acid molecule, and wherein the second substantially blunt end of the second nucleic acid molecule has a nucleotide sequence that is complementary to the second overhang of the second end of the first ds nucleic acid molecule. The method may be performed under conditions such that the first overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the second ds nucleic acid molecule, and wherein the second overhang can selectively hybridize to the complementary nucleotide sequence of the second substantially blunt end of the second ds nucleic acid molecule. Furthermore, the method is performed under conditions such that the first end of the first ds nucleic acid molecule is linked to the first end of the second ds nucleic acid molecule, and the second end of the first ds nucleic acid molecule is linked to the second substantially blunt end of the second ds nucleic acid molecule. In various modifications of the method described above, one or both of the first and second overhangs can be 3′ overhangs or 5′ overhangs. The ds nucleic acid molecules can thus engage in two separate strand invasion events, which, upon covalent linkage of nucleic acid strands at each terminus, result in the formation of a circular recombinant nucleic acid molecule.

As shown in FIG. 1D, a method for generating a linked, for example directionally linked, recombinant nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid molecule, which has a first overhang on a first strand at a first end; a second ds nucleic acid molecule, which has a first substantially blunt end and a second substantially blunt end; and a third ds nucleic acid molecule which has a second overhang on a first strand at a first end, wherein the first substantially blunt end of the second ds nucleic acid molecule has a nucleotide sequence that is complementary to the first overhang, and the second substantially blunt end of the second ds nucleic acid molecule has a nucleotide sequence that is complementary to the second overhang. The method is performed under conditions such that the first overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the second ds nucleic acid molecule, and wherein the second overhang can selectively hybridize to the complementary nucleotide sequence of the second substantially blunt end of the second ds nucleic acid molecule. Furthermore, the method may be performed under conditions such that the first end of the first ds nucleic acid molecule is linked to the first end of the second ds nucleic acid molecule, and the first end of the third ds nucleic acid molecule is linked to the second substantially blunt end of the second ds nucleic acid molecule. In various modifications of the method described above, one or both of the first and second overhangs can be 3′ overhangs or 5′ overhangs.

As shown in FIG. 1E, a method for generating a linked, for example directionally linked, recombinant nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid molecule, which has a first substantially blunt end; a second ds nucleic acid molecule which has a first overhang on a first strand at a first end and a second overhang on a second strand at a second end; and a third ds nucleic acid molecule, which has a second substantially blunt end, wherein the first substantially blunt end of the first ds nucleic acid molecule has a nucleotide sequence that is complementary to the first overhang of the first end of the second nucleic acid molecule, and wherein the second substantially blunt end has a nucleotide sequence that is complementary to the second overhang of the second end of the second ds nucleic acid molecule. The method is performed under conditions such that the first overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the first ds nucleic acid molecule, and wherein the second overhang can selectively hybridize to the complementary nucleotide sequence of the second substantially blunt end. Furthermore, the method may be performed under conditions such that the first substantially blunt end of the first ds nucleic acid molecule is linked to the first end of the second ds nucleic acid molecule, and the second substantially blunt end, located on the third ds nucleic acid molecule, is linked to the second end of the second ds nucleic acid molecule. In various modifications of the method described above, one or both of the first and second overhangs can be 3′ overhangs or 5′ overhangs.

A method for generating a directionally or non-directionally linked recombinant nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid molecule, which has a first topoisomerase covalently bound at or near a first substantially blunt end; and a second ds nucleic acid molecule, which has a first 3′ overhang on a first strand at a first end, wherein the first substantially blunt end of the first ds nucleic acid molecule has a nucleotide sequence that is complementary to the first 3′ overhang (see FIG. 2). The method is performed under conditions such that the first 3′ overhang can selectively hybridize to the complementary nucleotide sequence of the first substantially blunt end of the first ds nucleic acid molecule. Furthermore, the method is performed such that topoisomerase can covalently link the 3′ terminus of the first end of the first ds nucleic acid molecule to the 5′ terminus of the first end of the second ds nucleic acid molecule (Cheng and Shuman, Mol. Cell. Biol. 20:8059-8068, 2000, which is incorporated herein by reference in its entirety).

The ability of a topoisomerase covalently bound at or near a first substantially blunt end of a first ds nucleic acid molecule to be covalently linked to a second ds nucleic acid molecule with a 3′ overhang (FIG. 2) illuminates a previously unappreciated conformational flexibility of covalently bound topoisomerase with respect to its DNA contacts on the 5′ side of the scissile phosphodiester. In catalyzing the relaxation of supercoiled DNA, covalently bound topoisomerase type IB releases the downstream duplex and permits rotation of the duplex around the phosphodiester bond opposite the scissile phosphate before resealing the backbone.

A method of the present invention can be performed in a manner that obviates the need to perform an additional reaction to repair a remaining nick after the homology dependent ligation, by substantially replacing one strand of a nucleic acid molecule (see FIG. 2). For example, the methods can be performed such that the overhang sequence of one nucleic acid molecule extends the entire length of and, upon strand invasion, replaces a strand of the other ds nucleic acid molecule. Thus, in this embodiment, there is no nick in the strand that was not ligated by the topoisomerase.

The termini of the ds nucleic acid molecules that are linked using the methods of the current invention can be covalently linked using any reagent useful for linking a 5′ terminus of one nucleic acid molecule to a 3′ terminus of a second nucleic acid molecule. Thus, the reagent for covalently linking the termini can be, for example, a topoisomerase, including a type IA, type IB or type II topoisomerase; a ligase such as T4 DNA ligase; a recombinase, including FLIP recombinase, Int integrase, or cre recombinase; or another INT family member (see, for example, Nucl. Acids Res. 26:391-406, 1998). Furthermore, where one nick remains after a ligation of one strand, the nick can be closed by an in vivo ligation, for example by introduction into a cell, such as E. coli, of the nicked ds nucleic acid molecule. In certain preferred embodiments, the linking of the two ends involved in a strand displacement event involves topoisomerase ligation. Furthermore, in a method as disclosed for linking a first ds nucleic acid molecule and a second ds nucleic acid molecule, a third nucleic acid molecule also can be linked to an end of the first or second nucleic acid molecule that is not involved in a strand displacement event.

Methods of the present invention can be performed to link a first ds nucleic acid molecule to at least a second ds nucleic acid molecule in a non-directional or, preferably, a directional manner. However, the methods can be used to non-directionally link the first ds nucleic acid molecule and the second ds nucleic acid molecule in embodiments where complementarity exists between nucleotide sequences at or near a terminus of both ends of one of the ds nucleic acid molecules and nucleotide sequences at or near a terminus of at least one end of the other ds nucleic acid molecule. Such complementarity between nucleotide sequences at or near a terminus of both ends of one ds nucleic acid molecule and at least one strand of the second ds nucleic acid molecule can be achieved, for example, by including identical nucleotide sequences at the same terminus (i.e., 5′ or 3′) of both ends of a ds nucleic acid molecule. This can be accomplished, for example, by designing target sequences that can be cleaved with the same restriction enzyme and which contain the same nucleotide sequence.

The present invention also relates to methods of directionally or non-directionally linking a first and at least a second nucleic acid molecule, including, as desired, operatively linking two or more (e.g., 2, 3, 4, 5, 6, 7, etc.) of the nucleic acid molecules. A method for generating a directionally or non-directionally linked recombinant nucleic acid molecule can be performed, for example, by contacting a topoisomerase-charged first ds nucleic acid molecule, which has a first topoisomerase covalently bound at a first end, and a second topoisomerase covalently bound at a second end, and also contains a 5′ overhang at the first end and a blunt end, a 3′ uridine overhang, a 3′ thymidine overhang, or a second 5′ overhang at the second end; and a second ds nucleic acid molecule, which has a first blunt end and a second end, wherein the first blunt end has a 5′ nucleotide sequence that is complementary to the first 5′ overhang of the first end of the first nucleic acid molecule. The first and second topoisomerases can be the same, for example, two type IB topoisomerases, including two Vaccinia type IB topoisomerases, or can be different, including two type IB topoisomerases from different organisms or a type IB topoisomerase and a type IA or a type II topoisomerase.

In performing a method of the invention, the first and second ds nucleic acid molecules are contacted under conditions such that the 5′ nucleotide sequence of the second nucleic acid molecule can selectively hybridize to the first 5′ overhang, whereby the first topoisomerase can covalently link the 3′ terminus of the first end of the first ds nucleic acid molecule to the 5′ terminus of the first end of the second ds nucleic acid molecule, and the second topoisomerase can covalently link the 3′ terminus of the second end of the first ds nucleic acid molecule to the 5′ terminus of the second end of the second ds nucleic acid molecule, to generate a directionally or non-directionally linked recombinant nucleic acid molecule. Accordingly, the present invention provides a directionally or non-directionally linked recombinant nucleic acid molecule produced by such a method.

As disclosed herein, a method of the invention can provide a means to directionally link two or more ds nucleotides in a predetermined directional orientation. The term “directionally link” is used herein to refer to the covalent linkage of two or more nucleic acid molecules in a particular predetermined order and/or orientation. Thus, a method of the invention provides a means, for example, to covalently link a promoter expression control element upstream of a coding sequence, and to covalently link a polyadenylation signal downstream of the coding region to generate a functional expressible recombinant nucleic acid molecule; or to covalently link two coding sequences such that they can be transcribed and translated in frame to produce a fusion polypeptide. The term “non-directionally link” is used herein to refer to the covalent linkage of two or more nucleic acid molecules in a random order, i.e., either of the first or second end of the nucleic acid molecule can be linked to an end of another nucleic acid molecule.

The term “substantially blunt,” when used in reference to an end of a ds nucleic acid molecule, means that the end can be blunt or can have a short overhang that does not reduce or inhibit a strand invasion event by a second nucleic acid molecule having an overhang. For example, a substantially blunt end can include an end having an overhang of 1, 2, or a few nucleotides, provided the overhang at the substantially blunt end does not block strand invasion. For example, the second ds nucleic acid molecule can have a 5′ adenosine or 5′ inosine overhang.

It should be recognized that reference herein to a “first nucleic acid molecule,” “second nucleic acid molecule,” “third nucleic acid molecule,” and the like, is used only to provide a means to indicate which of several nucleic acid molecules is being referred to. Thus, absent any specifically defined characteristic with respect to a particular nucleic acid molecule, the terms “first,” “second,” “third” and the like, when used in reference to a nucleic acid molecule, or a population or plurality of nucleic acid molecules, are not intended to indicate any particular order, importance or other information about the nucleic acid molecule. Thus, where an exemplified method refers, for example, to using PCR to amplify a first ds nucleic acid molecule such that the amplification product contains a topoisomerase recognition site at one or both ends, it will be recognized that, similarly, a second (or other) ds nucleic acid molecule also can be so amplified.

The term “at least a second,” when used in reference to a ds nucleic acid molecule, means one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) nucleic acid molecules in addition to a first ds nucleic acid molecule. Thus, the term can refer to only a second nucleic acid molecule, or to a second nucleic acid molecule and a third nucleic acid molecule (or more). As such, the term “second (or other) ds nucleic acid molecule” or “second (and other) ds nucleic acid molecules” is used herein in recognition of the fact that the term “at least a second nucleic acid molecule” can refer to a second, third or more nucleic acid molecules. It should be recognized that, unless indicated otherwise, a nucleic acid molecule encompassed within the meaning of the term “at least a second nucleic acid molecule” can be the same or substantially the same as a first nucleic acid molecule. For example, a first and second ds nucleic acid molecule can be the same except that only one of the molecules, for example, the first ds nucleic acid molecule, has a topoisomerase recognition site, or except for having complementary 5′ overhanging sequences, for example, produced upon cleavage by a topoisomerase, such that the first and second ds nucleic acid molecules can be directionally linked using a method of the invention. As such, a method of the invention can be used to produce a concatemer of first and second ds nucleic acid molecules, which, optionally, can be interspersed, for example, by a third ds nucleic acid molecule such as an expression control element, and can contain the directionally linked sequences in a predetermined directional orientation, for example, each in a 5′ to 3′ orientation with respect to each other.

It will be recognized that each of the ds nucleic acid molecules, for example, a sequence referred to as a first ds nucleic acid molecule, generally comprises a population of such nucleic acid molecules, which are identical or substantially identical to each other. Thus, it should be clear that the term “different” is used in comparing, for example, a first (or population of first) ds nucleic acid molecules with a second (and other) ds nucleic acid molecule. As used herein, the term “different,” when used in reference to the ds nucleic acid molecules of a composition of the invention, means that the ds nucleic acid molecules share less than 95% sequence identity with each other when optimally aligned, generally less than 90% sequence identity, and usually less than 70% sequence identity. Thus, ds nucleic acid molecules that, for example, differ only in being polymorphic variants of each other or that merely contain different 5′ or 3′ overhanging sequences are not considered to be “different” for purposes of a composition of the invention. In comparison, different ds nucleic acid molecules are exemplified by a first sequence encoding a polypeptide and a second sequence comprising an expression control element, or a first sequence encoding a first polypeptide and a second sequence encoding a non-homologous polypeptide.
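The identity thresholds recited above can be illustrated, by way of a rough sketch only, as follows. The specification does not prescribe an alignment algorithm or scoring scheme, so the sketch substitutes a simple stand-in: identity is taken as the length of the longest common subsequence divided by the length of the longer sequence. That measure, the function names, and the example sequences are assumptions made for illustration.

```python
# Illustrative sketch: decide whether two sequences are "different" using
# the 95% identity threshold. The identity measure used here (longest common
# subsequence over the longer length) is an assumed stand-in for an optimal
# alignment, which the text does not specify.


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def percent_identity(a: str, b: str) -> float:
    """Approximate fractional identity (0.0 to 1.0) between two sequences."""
    a, b = a.upper(), b.upper()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def is_different(a: str, b: str, threshold: float = 0.95) -> bool:
    """True when the sequences share less than `threshold` identity."""
    return percent_identity(a, b) < threshold


if __name__ == "__main__":
    print(is_different("ACGTACGTAC", "ACGTACGTAC"))  # False
    print(is_different("ACGTACGTAC", "TTTTTTTTTT"))
```

The stricter values mentioned in the text (90% or 70%) can be supplied through the `threshold` argument.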

The term “recombinant” is used herein to refer to a nucleic acid molecule that is produced by linking at least two nucleic acid molecules. As such, a recombinant nucleic acid molecule encompassed within or generated according to a method of the invention is distinguishable from a nucleic acid molecule that may be produced in nature, for example, during meiosis. A recombinant nucleic acid molecule generated according to a method of the invention can be identified, for example, by the presence of the complementary nucleic acid sequence in close proximity, generally directly adjacent, and usually directly 3′, to a topoisomerase binding site in a double stranded nucleic acid molecule.

As disclosed herein, a method of the invention can be used to directionally or non-directionally link a first ds nucleic acid molecule to a second ds nucleic acid molecule. In many embodiments, the method may be used to directionally link a first ds nucleic acid molecule and a second (or other) ds nucleic acid molecule. However, use of the method to non-directionally link a first ds nucleic acid molecule and a second (or other) ds nucleic acid molecule also provides advantages. Non-directional linking can be performed, for example, 1) where a second nucleotide sequence is present at or near the 5′ terminus of the second end of the first ds nucleic acid molecule, which can form all or part of a second overhang, and is capable of hybridizing to the 5′ complementary nucleotide sequence of the second ds nucleic acid molecule; and 2) where a nucleotide sequence is present at or near the 5′ terminus of both the first end and the second end of the second ds nucleic acid molecule that is capable of hybridizing to the 5′ overhang of the first end of the first ds nucleic acid molecule. In these embodiments, the second end of the first ds nucleic acid molecule and the second end of the second ds nucleic acid molecule can be either blunt, or include an overhang.
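The second situation described above, in which the sequence capable of hybridizing to the 5′ overhang is present at one or at both ends of the second ds nucleic acid molecule, can be sketched as a simple check. The sketch assumes that the second molecule is blunt and is represented by its top strand written 5′ to 3′, that hybridization requires exact complementarity at the very 5′ terminus, and that the overhang and insert sequences shown are hypothetical.

```python
# Illustrative sketch: predict whether linking is expected to be directional
# or non-directional from where the complement of the vector overhang occurs
# on a blunt-ended second molecule. Exact terminal matching is an assumed
# simplification of "at or near the 5' terminus".

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ends_matching_overhang(overhang: str, insert_top: str) -> list:
    """List the ends of the insert whose 5' terminus can pair with the overhang."""
    target = reverse_complement(overhang)
    top = insert_top.upper()
    bottom = reverse_complement(top)  # other strand; its 5' terminus is the second end
    matches = []
    if top.startswith(target):
        matches.append("first end")
    if bottom.startswith(target):
        matches.append("second end")
    return matches


def linking_mode(overhang: str, insert_top: str) -> str:
    """Classify the expected outcome for one overhang and one insert."""
    count = len(ends_matching_overhang(overhang, insert_top))
    if count == 0:
        return "no strand invasion expected"
    return "directional" if count == 1 else "non-directional"


if __name__ == "__main__":
    print(linking_mode("GGTG", "CACCATGGCTTAA"))   # directional
    print(linking_mode("GGTG", "CACCATGGCTGGTG"))  # non-directional
```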

In another embodiment, a method for generating a directionally or non-directionally linked recombinant nucleic acid molecule can be performed, for example, by contacting a first precursor ds nucleic acid molecule having a first end, which has a first 5′ target sequence at the 5′ terminus and a topoisomerase recognition site at or near the 3′ terminus, and a second end, which has a topoisomerase recognition site at or near the 3′ terminus; a second ds nucleic acid molecule having a first blunt end and a second end, wherein the first blunt end has a 5′ nucleotide sequence complementary to the 5′ target sequence of the first precursor ds nucleic acid molecule; and a topoisomerase that is specific for the topoisomerase recognition site. As used herein, reference to a “precursor” ds nucleic acid molecule means a ds nucleic acid molecule that contains a topoisomerase recognition site and that, upon cleavage by a topoisomerase specific for the recognition site, produces an end having a desired 5′ terminal nucleotide sequence, 3′ terminal nucleotide sequence, or both. Such a desired end is produced, in part, due to the presence of the 5′ target sequence, which, upon cleavage of the ds nucleic acid molecule containing the 5′ target sequence, is converted to a 5′ nucleotide sequence that allows directionally linking the ds nucleic acid molecule to a second ds nucleic acid molecule.

According to a method of the invention, the first ds nucleic acid molecule, second ds nucleic acid molecule and topoisomerase are contacted under conditions that allow topoisomerase activity, i.e., such that the topoisomerase can bind to and cleave the recognition site, to produce a topoisomerase-charged 3′ terminus, and can ligate the 3′ terminus to an appropriate 5′ terminus. Such conditions also allow hybridization of the portion of the first 5′ target sequence that remains following cleavage by the topoisomerase and the 5′ nucleotide sequence of the second ds nucleic acid molecule that is complementary to that portion of the 5′ target sequence.

In performing a method of the invention, a precursor ds nucleic acid molecule can be combined in the same reaction vessel and at the same time with the topoisomerase and the second ds nucleic acid molecule before the precursor ds nucleic acid molecule is converted into a topoisomerase-charged ds nucleic acid molecule that can be directionally linked to a second ds nucleic acid molecule. By combining the topoisomerase in the same reaction vessel as the precursor ds nucleic acid and the second nucleic acid, the methods of the present invention are simplified. Alternatively, a first precursor ds nucleic acid molecule can be combined with topoisomerase under conditions that allow topoisomerase cleavage and binding, then a second ds nucleic acid molecule can be added.

A precursor ds nucleic acid molecule can be linear or circular,including supercoiled, and, as a result of cleavage by one or moretopoisomerases and, if desired a restriction endonuclease, a lineartopoisomerase-charged first ds nucleic acid molecule is produced. Forexample, a circular ds nucleic acid molecule containing two type IBtopoisomerase recognition sites within about 100 nucleotides of eachother and in the complementary strands, preferably within about twentynucleotides of each other and in the complementary strands, can becontacted with a site specific type IB topoisomerase such that eachstrand is cleaved and the intervening sequence dissociates, therebygenerating a linear ds nucleic acid molecule having a topoisomerasecovalently bound to each end.

In general, a topoisomerase-charged double stranded nucleic acid, which can be directionally linked to a second or other ds nucleic acid molecule, is generated by contacting topoisomerase with a precursor ds nucleic acid that has at least one topoisomerase recognition site at or near one end and a first target sequence. As used herein, the term “topoisomerase recognition site” means a defined nucleotide sequence that is recognized and bound by a site specific topoisomerase. For example, the nucleotide sequence 5′-(C/T)CCTT-3′ is a topoisomerase recognition site that is bound specifically by most poxvirus topoisomerases, including Vaccinia virus DNA topoisomerase I, which then can cleave the strand after the 3′-most thymidine of the recognition site to produce a nucleic acid molecule comprising 5′-(C/T)CCTT-PO₄-TOPO, i.e., a complex of the topoisomerase covalently bound to the 3′ phosphate through a tyrosine residue in the topoisomerase (see Shuman, J. Biol. Chem. 266:11372-11379, 1991; Sekiguchi and Shuman, Nucl. Acids Res. 22:5360-5365, 1994; each of which is incorporated herein by reference; see, also, U.S. Pat. No. 5,766,891; PCT/US95/16099; PCT/US98/12372).
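By way of illustration only, the location of 5′-(C/T)CCTT-3′ recognition sites on a single strand, and the point after the 3′-most thymidine at which cleavage would occur, can be sketched in Python as follows. The function names `find_cleavage_positions` and `cleave` and the example sequences are hypothetical and are not part of the disclosure; the sketch treats the strand as a plain 5′ to 3′ string and does not model enzyme kinetics or the opposite strand.

```python
import re

# Recognition site quoted in the text: 5'-(C/T)CCTT-3'.
# A lookahead is used so that overlapping or adjacent sites are all reported.
SITE = re.compile(r"(?=([CT]CCTT))")


def find_cleavage_positions(strand_5to3: str) -> list[int]:
    """Return slice points located just after the 3'-most T of each site.

    Hypothetical helper: positions are 0-based indexes into the strand,
    so strand[:pos] ends with the recognition site.
    """
    seq = strand_5to3.upper()
    return [m.start() + 5 for m in SITE.finditer(seq)]


def cleave(strand_5to3: str, pos: int) -> tuple[str, str]:
    """Split a strand at a slice point.

    The first element would carry the covalently bound topoisomerase at its
    3' phosphate; the second element is the downstream (leaving) segment.
    """
    seq = strand_5to3.upper()
    return seq[:pos], seq[pos:]


if __name__ == "__main__":
    strand = "AAGCCCTTGACT"  # illustrative sequence only
    for pos in find_cleavage_positions(strand):
        charged, leaving = cleave(strand, pos)
        print(f"{charged}-PO4-TOPO   leaving segment: {leaving}")
```

In this sketch the returned slice point, rather than the site start, is reported because the slice point is what determines the length of the segment released on cleavage.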

An advantage of constructing a precursor ds nucleic acid molecule tocomprise, for example, a type IB topoisomerase recognition site about 2to 15 nucleotides (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 12, 14, 15, or 20nucleotides) from one or both ends is that a 5′ overhang is generatedfollowing cleavage of the ds nucleic acid molecule by a site specifictopoisomerase. Such a 5′ overhanging sequence, which would contain 2 to20 nucleotides, respectively, can be designed using a PCR method asdisclosed herein to have any sequence as desired. Thus, where a cleavedfirst ds nucleic acid molecule is to be directionally linked to aselected second (or other) ds nucleic acid molecule according to amethod of the invention, and where the selected sequence has a 5′overhanging sequence, the 5′ overhang on the first ds nucleic acidmolecule can be designed to be complementary to the 5′ overhang on theselected second (or other) ds sequence such that the two (or more)sequences are directionally linked in a predetermined orientation due tothe complementarity of the 5′ overhangs. As discussed above, similarmethods can be utilized with respect to 3′ overhanging sequencesgenerated upon cleavage by a type IA or type II topoisomerase.
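The relationship between the position of the recognition site and the resulting 5′ overhang can be illustrated with a short Python sketch. It assumes an idealized, fully base-paired end that is blunt before cleavage, a single CCCTT site near the 3′ terminus of the strand supplied, and spontaneous dissociation of the segment downstream of the cleavage site; the function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def five_prime_overhang(strand_5to3: str, site: str = "CCCTT") -> str:
    """Return the 5' overhang (written 5' to 3') exposed on the opposite strand.

    Assumption: the strand is cleaved immediately after the 3'-most
    occurrence of `site`, and the segment 3' of the cut dissociates,
    leaving its complement single stranded.
    """
    seq = strand_5to3.upper()
    idx = seq.rfind(site)
    if idx == -1:
        raise ValueError("no recognition site found on this strand")
    leaving = seq[idx + len(site):]
    return reverse_complement(leaving)


if __name__ == "__main__":
    # Site placed four nucleotides from the 3' terminus (illustrative only).
    overhang = five_prime_overhang("AAGCCCTTCACC")
    print(overhang, len(overhang))
```

Under these assumptions the overhang length equals the number of nucleotides between the cleavage site and the 3′ terminus, so changing the nucleotides placed after the site changes the overhang sequence directly.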

As used herein, the term “cleavage product,” when used in reference to atopoisomerase recognition site, refers to a nucleic acid molecule thathas been cleaved by a topoisomerase, generally at its recognition site,and comprises a complex of the topoisomerase covalently bound, in thecase of a type IB topoisomerase to the 3′ phosphate group of the 3′terminal nucleotide in the topoisomerase recognition site, or, in thecase of type IA or type II topoisomerase, to the 5′ phosphate group ofthe 5′ terminal nucleotide in the topoisomerase recognition site. Such acomplex, which comprises a topoisomerase cleaved ds nucleic acidmolecule having the topoisomerase covalently bound thereto, is referredto herein as a “topoisomerase-activated” or a “topoisomerase-charged”nucleic acid molecule. Topoisomerase-activated ds nucleic acid moleculescan be used in a method of the invention, as can ds nucleic acidmolecules that contain an uncleaved topoisomerase recognition site and atopoisomerase, wherein the topoisomerase can cleave the ds nucleic acidmolecule at the recognition site and become covalently bound thereto.

As will be readily apparent from the present disclosure, the ends of dsnucleic acid molecules to be linked according to a method of theinvention can have various characteristics. For example, in one aspect,the second end of a first precursor ds nucleic acid is a blunt end uponcleavage by the topoisomerase, and the second end of a second ds nucleicacid molecule is a blunt end. In another aspect, the second end of afirst precursor ds nucleic acid molecule has a 3′ thymidine extensionupon cleavage by the topoisomerase, and the second end of a second dsnucleic acid molecule has a 3′ adenosine overhang, or the termini cancomprise 3′ adenosine and 3′ deoxyuridine overhangs (see U.S. Pat. Nos.5,487,993 and 5,856,144, each of which is incorporated herein byreference). In yet another aspect, a first precursor ds nucleic acidmolecule has a second 5′ target sequence at the second end, and thesecond end of a second ds nucleic acid molecule has a 5′ nucleotidesequence complementary to at least a portion of the second 5′ targetsequence. The first precursor ds nucleic acid molecule can be a vector,including a cloning vector and an expression vector, and, wheregenerally present in a circular form, can be linearized due to theaction of the topoisomerase, or can be linearized by including, forexample, one or two restriction endonucleases that linearize the vectorsuch that, upon contact with the topoisomerase, the first and second dsnucleic acid molecules can be directionally linked according to a methodof the invention.

As used herein, the term “at or near,” when used in reference to theproximity of a topoisomerase recognition site to the 3′ (type IB) or 5′(type IA or type II) terminus of a nucleic acid molecule, means that thesite is within about 1 to 100 nucleotides from the 3′ terminus or 5′terminus, respectively, generally within about 1 to 20 nucleotides fromthe terminus, and particularly within about 2 to 12 nucleotides from therespective terminus. An advantage of positioning the topoisomeraserecognition site within about 10 to 15 nucleotides of a terminus isthat, upon cleavage by the topoisomerase, the portion of the sequencedownstream of the cleavage site can spontaneously dissociate from theremaining nucleic acid molecule, which contains the covalently boundtopoisomerase (referred to generally as “suicide cleavage”; see, forexample, Shuman, supra, 1991; Andersen et al., supra, 1991). Where atopoisomerase recognition site is greater than about 12 to 15nucleotides from the terminus, the nucleic acid molecule upstream ordownstream of the cleavage site can be induced to dissociate from theremainder of the sequence by modifying the reaction conditions, forexample, by providing an incubation step at a temperature above themelting temperature of the portion of the duplex including thetopoisomerase cleavage site.

A method of the invention using a first precursor ds nucleic acidmolecule having a 5′ target sequence and a topoisomerase on a first endand a second end, can be performed to directionally or non-directionallylink a first precursor ds nucleic acid molecule to a second ds nucleicacid molecule. The method typically is used to directionally link thefirst precursor ds nucleic acid molecule and the second ds nucleic acidmolecule, and also can be used to non-directionally link the firstprecursor ds nucleic acid molecule and the second ds nucleic acidmolecule. Non-directional linking can be performed, for example, 1)where a second nucleotide sequence is present at or near the 5′ terminusof the second end of the first precursor ds nucleic acid molecule, thatis capable of hybridizing to the 5′ complementary nucleotide sequence ofthe second ds nucleic acid molecule; and 2) where a nucleotide sequenceis present at or near the 5′ terminus of both the first end and thesecond end of the second ds nucleic acid molecule that is capable ofhybridizing to the 5′ target sequences at or near the first end of thefirst precursor ds nucleic acid molecule. In these embodiments, thesecond end of the first precursor ds nucleic acid molecule and thesecond end of the second ds nucleic acid molecule can be either blunt,or include an overhang.

A method for generating a directionally or non-directionally linkedrecombinant nucleic acid molecule also can be performed by contacting atopoisomerase-charged first ds nucleic acid molecule, which has, at afirst end, a first 5′ overhang and a first topoisomerase covalentlybound to the 3′ terminus, and a second ds nucleic acid molecule, whichhas a first blunt end and a second end, wherein the first blunt endincludes a 5′ nucleotide sequence complementary to the first 5′overhang. The method is performed under conditions such that the 5′nucleotide sequence of the first blunt end can selectively hybridize tothe first 5′ overhang, whereby the first topoisomerase can covalentlylink the 3′ terminus of the first end of the first ds nucleic acidmolecule with the 5′ terminus of the first end of the second ds nucleicacid molecule.

Such a method can further include contacting the topoisomerase-chargedfirst ds nucleic acid molecule and the second ds nucleic acid moleculewith a third ds nucleic acid molecule, wherein a first end of the thirdnucleic ds acid molecule has a 5′ overhang and a second topoisomerasecovalently bound at the 3′ terminus, and wherein the second ds nucleicacid molecule has a second blunt end, which includes a 5′ nucleotidesequence complementary to the 5′ overhang of the third nucleic acidmolecule. The contacting can be performed, for example, under conditionssuch that the 5′ nucleotide sequence of the second blunt end of thesecond ds nucleic acid can selectively hybridize to the 5′ overhang ofthe first end of the third ds nucleic acid molecule, whereby the secondtopoisomerase can covalently link the 3′ terminus of the first end ofthe third ds nucleic acid molecule with the 5′ terminus of the secondblunt end of the second ds nucleic acid molecule. The first and secondtopoisomerases can be the same or different and, if desired, the firstor third ds nucleic acid molecules, instead of beingtopoisomerase-charged, can contain a topoisomerase recognition site,wherein the method can further include contacting the reactants with atopoisomerase. A method of the invention can be performed such that thefirst ds nucleic acid molecule is directionally linked to the second dsnucleic acid molecule and, thereafter, the third ds nucleic acidmolecule is directionally or non-directionally linked to the second dsnucleic acid molecule, or all of the reactants can be included togetherat the same time.

A method of the invention provides a means to render an open reading frame from a cDNA or an isolated genomic DNA sequence expressible by operatively linking one or more expression control elements to the putative coding sequence. Examples of expression control elements useful in the present invention are disclosed herein and include transcriptional expression control elements, translational expression control elements, elements that facilitate the transport or localization of a nucleic acid molecule or polypeptide in (or out of) a cell, elements that confer a detectable phenotype, and the like. Transcriptional expression control elements include, for example, promoters such as those from cytomegalovirus, Moloney leukemia virus, and herpes virus, as well as those from the genes encoding metallothionein, skeletal actin, phosphoenolpyruvate carboxylase, phosphoglycerate, dihydrofolate reductase, and thymidine kinase, as well as promoters from viral long terminal repeats (LTRs) such as Rous sarcoma virus LTR; enhancers, which can be constitutively active such as an immunoglobulin enhancer, or inducible such as SV40 enhancer; and the like. For example, a metallothionein promoter is a constitutively active promoter that also can be induced to a higher level of expression upon exposure to a metal ion such as copper, nickel or cadmium ion. In comparison, a tetracycline (tet) inducible promoter is an example of a promoter that is induced upon exposure to tetracycline, or a tetracycline analog, but otherwise is inactive.

A transcriptional expression control element also can be a tissuespecific expression control element, for example, a muscle cell specificexpression control element, such that expression of an encoded productis restricted to the muscle cells in an individual, or to muscle cellsin a mixed population of cells in culture, for example, an organculture. Muscle cell specific expression control elements including, forexample, the muscle creatine kinase promoter (Sternberg et al., Mol.Cell. Biol. 8:2896-2909, 1988, which is incorporated herein byreference) and the myosin light chain enhancer/promoter (Donoghue etal., Proc. Natl. Acad. Sci., USA 88:5847-5851, 1991, which isincorporated herein by reference) are well known in the art. Othertissue specific promoters, as well as expression control elements onlyexpressed during particular developmental stages of a cell or organismare well known in the art.

Expression control or other elements useful in generating a constructaccording to a method of the invention can be obtained in various ways.In particular, many of the elements are included in commerciallyavailable vectors and can be isolated therefrom and can be modified tocontain a topoisomerase recognition site at one or both ends, forexample, using a PCR method as disclosed herein. In addition, thesequences of or encoding the elements useful herein generally are wellknown and disclosed in publications. In many cases, the elements, forexample, transcriptional and translational expression control elements,as well as cell compartmentalization domains, are relatively shortsequences and, therefore, are amenable to chemical synthesis of theelement or a nucleotide sequence encoding the element. Thus, in oneembodiment, an element comprising a composition of the invention, usefulin generating a recombinant nucleic acid molecule according to a methodof the invention, or included within a kit of the invention, can bechemically synthesized and, if desired, can be synthesized to contain atopoisomerase recognition site at one or both ends of the element and,further, to contain an overhanging sequence following cleavage by a sitespecific topoisomerase.

Where ds nucleic acid molecules are to be directionally linked accordingto a method of the invention, the nucleic acid molecules generally areoperatively linked such that the recombinant nucleic acid molecule thatis generated has a desired structure, performs a desired function,encodes a desired expression product, or the like. As used herein, theterm “operatively linked” means that two or more nucleic acid moleculesare positioned with respect to each other such that they act as a unitto effect a function attributable to one or both sequences or acombination thereof. For example, a nucleic acid molecule containing anopen reading frame can be operatively linked to a promoter such that thepromoter confers its regulatory effect on the open reading framesimilarly to the way in which it would effect expression of an openreading frame that it normally is associated with in a genome in a cell.Similarly, two or more nucleic acid molecules comprising open readingframes can be operatively linked in frame such that, upon transcriptionand translation, a chimeric fusion polypeptide is produced.

A first ds nucleic acid molecule comprising an open reading frame can beamplified using any amplification method, for example, by PCR using aprimer pair, to generate an amplified first ds nucleic acid moleculehaving a 5′ nucleotide sequence complementary to a 5′ overhang of atopoisomerase-charged ds nucleic acid molecule of the present invention.Where both ends of the amplified first ds nucleic acid molecule arecomplements of two overhangs on the topoisomerase-charged ds nucleicacid molecule, the 5′ overhangs are different from each other. Theamplified first ds nucleic acid molecule then can be contacted with thetopoisomerase-charged ds nucleic acid molecule comprising a desiredexpression control element such as a promoter such that the secondnucleic acid molecule is operatively linked to the 5′ end of the codingsequence according to a method of the invention.

Various combinations of components can be used in a method of theinvention. For example, the method can be performed by contacting atopoisomerase-activated first ds nucleic acid molecule; a second dsnucleic acid molecule having a first end and a second end, wherein atthe first end or second end or both, the second nucleic acid moleculehas a hydroxyl group at the 5′ terminus of the same end; and a 5′overhang. Where the 5′ terminus of one or both ends to be linked has a5′ phosphate group, a phosphatase also can be contacted with thecomponents of the reaction mixture. Upon contacting, the phosphatase, ifnecessary, can generate a 5′ hydroxyl group at the same end, and thesecond ds nucleic acid molecule then can be directionally linked to thetopoisomerase-activated first ds nucleic acid molecule. The skilledartisan will recognize other combinations of components useful forperforming a method of the invention.

As used herein, reference to contacting a first nucleic acid molecule and a second nucleic acid molecule “under conditions such that all components are in contact” means that the reaction conditions are appropriate for a topoisomerase-cleaved end of a first ds nucleic acid molecule to come into sufficient proximity to an end of a second nucleic acid molecule such that a topoisomerase can effect its enzymatic activity and covalently link the 3′ terminus of the first ds nucleic acid molecule to a 5′ hydroxyl group at the terminus of the second nucleic acid molecule. Such conditions include, for example, the reaction temperature, ionic strength, and pH. Additionally, such conditions can be determined empirically or using formulas that predict conditions for specific hybridization of nucleic acid molecules, as is well known in the art (see, for example, Sambrook et al., Molecular Cloning: A laboratory manual (Cold Spring Harbor Laboratory Press 1989); Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md. (1987, and supplements through 1995), each of which is incorporated herein by reference).

As disclosed herein, a PCR method using primers designed to incorporatecomplementary nucleotide sequences at one or both ends of an amplifiedds nucleic acid molecule provides an example of a convenient means forproducing ds nucleic acid molecules useful in a method of the invention.At least one of the primers of a primer pair is designed such that itcomprises, in a 5′ to 3′ orientation, a nucleotide sequencecomplementary to a first overhang on the topoisomerase-charged dsnucleic acid molecule of the present invention. The second primer of thePCR primer pair can be complementary to a desired sequence of thenucleic acid molecule to be amplified, and can comprise a secondcomplementary sequence.

A primer can contain or encode any other sequence of interest, including, for example, a site specific integration recognition site such as an att site, a lox site, or the like, or, as discussed above, can simply be used to introduce a topoisomerase recognition site into a ds nucleic acid molecule comprising such a sequence of interest. A recombinant nucleic acid molecule generated according to a method of the invention and containing a site specific integration recognition site such as an att site or lox site can be integrated specifically into a desired locus such as into a vector, a gene locus, or the like, that contains the required integration site, for example, an att site or lox site, respectively, and upon contact with the appropriate enzymes required for the site specific event, for example, lambda Int and IHF proteins or Cre recombinase, respectively. The incorporation, for example, of attB or attP sequences into a directionally or non-directionally linked recombinant nucleic acid molecule according to a method of the invention allows for the convenient manipulation of the nucleic acid molecule using the GATEWAY™ Cloning System (Invitrogen Corp., La Jolla Calif.). A first ds nucleic acid molecule used in a method of the invention can be a linearized vector containing two site specific integration sites, for example, an “entry vector” (GATEWAY™ Cloning System), and a method of the invention can be used to insert a second (or other) ds nucleic acid molecule between the site specific integration sites.

A ds nucleic acid molecule to be used in a method or kit of theinvention can be amplified using any amplification reaction, forexample, using the polymerase chain reaction (PCR), to contain acomplementary nucleotide sequence at a 5′ end. Although exemplified byPCR, other amplification methods also can be used to amplify a nucleicacid molecule such that the amplified nucleic acid molecule has acomplementary sequence at the 5′ terminus of one of its ends. Thecomplementary nucleotide sequence is complementary to the 5′ overhang onthe topoisomerase-charged ds nucleic acid to which the amplified nucleicacid will be ligated. This complementarity facilitates the associationof the nucleic acid molecules in a predetermined orientation, whereuponthey can be linked by topoisomerase according to a method of theinvention.
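As a non-limiting sketch of how such a complementary sequence might be added during primer design, the following Python example prepends to a gene-specific primer the reverse complement of the 5′ overhang, so that the 5′ terminus of the amplified product can hybridize to the overhang. The function name `add_complementary_tail`, the overhang GGTG, and the gene-specific sequence are illustrative assumptions only; melting temperature, secondary structure, and other primer design considerations are not addressed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def add_complementary_tail(overhang_5to3: str, gene_specific_5to3: str) -> str:
    """Build a primer whose 5' end is complementary to a 5' overhang.

    Hypothetical helper: both arguments are written 5' to 3'. Because
    hybridized strands are antiparallel, the tail is the reverse
    complement of the overhang.
    """
    tail = reverse_complement(overhang_5to3)
    return tail + gene_specific_5to3.upper()


if __name__ == "__main__":
    # Illustrative overhang and gene-specific sequence, not from the disclosure.
    primer = add_complementary_tail("GGTG", "ATGGCTAGC")
    print(primer)
```

Only one primer of the pair receives the tail in this sketch, which corresponds to the case in which a single end of the amplified molecule is to hybridize to the overhang.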

Amplification primers can be designed to impart particularcharacteristics to a desired ds nucleic acid molecule, for example, a dsnucleic acid molecule that encodes a transcriptional or translationalexpression control element or a coding sequence of interest such as anepitope tag or cell compartmentalization domain. In this aspect, theamplification primers are designed such that, upon amplification, the dsnucleic acid molecule contains a 5′ complementary sequence at one orboth ends, as desired.

Amplification primers also can be used to amplify a directionally linked recombinant nucleic acid molecule generated according to a method of the invention. For example, a method of the invention can generate an expressible recombinant nucleic acid molecule from three ds nucleic acid molecules, including a nucleic acid molecule comprising a promoter, a nucleic acid molecule comprising a coding sequence, and a nucleic acid molecule comprising a polyadenylation signal. The generation of the nucleic acid molecule is facilitated by the incorporation of complementary 5′ (or 3′) sequences at the ends of the ds nucleotide sequences to be joined, wherein preferably one of the complementary sequences is an overhang sequence.

As such, by designing a PCR primer pair containing a first primer that is specific for an overhang of the nucleic acid molecule comprising the promoter that is upstream from the promoter, and a second primer that is specific for an overhang of the nucleic acid molecule comprising the polyadenylation signal that is downstream of the signal, only a full length functional recombinant nucleic acid molecule containing the promoter, coding sequence and polyadenylation signal in the correct (predetermined) orientation will be amplified. In particular, partial reaction products, for example, containing only a promoter linked to the coding sequence, and reaction products containing nicks are not amplified. Thus, PCR can be used to specifically design a ds nucleic acid molecule such that it is useful in a method of the invention, and to selectively amplify only those reaction products having the desired components and characteristics.

A method of the invention can be performed such that a second ds nucleic acid molecule to be directionally ligated to a first ds nucleic acid molecule is one of a plurality of second ds nucleic acid molecules. As used herein, the term “plurality,” when used in reference to a first or at least a second nucleic acid molecule, means that the nucleic acid molecules are related but different. For purposes of the present invention, the nucleic acid molecules of a plurality are “related” in that each nucleic acid molecule in the plurality contains, for example, a 5′ nucleotide sequence that is complementary to a 5′ overhang sequence present on a topoisomerase-charged ds nucleic acid molecule to which the second ds nucleic acid molecules are to be directionally linked. Furthermore, the nucleic acid molecules of a plurality are “different” in that they can comprise, for example, a cDNA library, a combinatorial library of nucleic acid molecules, a variegated population of nucleic acid molecules, or the like. Methods of making cDNA libraries, combinatorial libraries, libraries comprising variegated populations of nucleic acid molecules, and the like are well known in the art (see, for example, U.S. Pat. No. 5,837,500; U.S. Pat. No. 5,622,699; U.S. Pat. No. 5,206,347; Scott and Smith, Science 249:386-390, 1992; Markland et al., Gene 109:13-19, 1991; O'Connell et al., Proc. Natl. Acad. Sci., USA 93:5883-5887, 1996; Tuerk and Gold, Science 249:505-510, 1990; Gold et al., Ann. Rev. Biochem. 64:763-797, 1995; each of which is incorporated herein by reference).

Where a second ds nucleic acid molecule is one of a population of dsnucleic acid molecules, a method of the invention can further utilize apopulation of first ds nucleic acid molecules, each of which contains adifferent and randomly generated nucleotide sequence at or near a 3′and/or 5′ terminus of a first and/or second end, for example, randomlygenerated 3′ or 5′ overhangs at or near a first end. Such a populationof randomly generated nucleotide sequences near an end will includecomplementary sequences to nucleotide sequences at or near the ends ofmany or all of the second ds nucleic acid molecules of the plurality.Thus, the method can be used to generate linked recombinant nucleic acidmolecules, including many or all of the nucleic acid molecules in theplurality of second ds nucleic acid molecules.

The methods of the invention have broad application to the field ofmolecular biology. As discussed in greater detail below, the methods ofthe invention can be used, for example, to label DNA or RNA probes, togenerate sense or antisense RNA, to prepare bait or prey constructs forperforming a two hybrid assay, to prepare linear expression elements, toprepare constructs useful for coupled in vitro transcription/translationassays, and to perform directional cloning.

A directionally or non-directionally linked recombinant nucleic acidmolecule generated according to this aspect of the invention can belinear, but preferably is circular, particularly a vector, as describedabove. The circular recombinant nucleic acid molecule can be generatedsuch that it has the characteristics of a vector, and contains, forexample, expression control elements required for replication in aprokaryotic host cell, a eukaryotic host cell, or both, and can containa nucleotide sequence encoding a polypeptide that confers antibioticresistance or the like.

A method of the invention can further include introducing adirectionally or non-directionally linked recombinant nucleic acidmolecule into a cell, which can be a prokaryotic cell such as abacterium or a eukaryotic cell such as a mammalian cell. Accordingly,the present invention also provides a cell produced by a method of theinvention, as well as a non-human transgenic organism produced from sucha cell. An advantage of such a method is that the generated recombinantnucleic acid molecule, which is circularized according to a method ofthe invention, can be transformed or transfected into an appropriatehost cell, wherein the construct is amplified. Thus, an in vivo methodusing a host cell can be used for obtaining a large amount of acircularized product generated according to a method of the invention.

It should be recognized that a method of the invention is characterized, in part, in that a linked recombinant nucleic acid molecule generated thereby either contains a nick in one strand, because a topoisomerase is attached to only one 3′ terminus of the ends to be linked, or comprises one strand that is derived completely from only one of two nucleic acid molecules linked according to the method. Where the recombinant nucleic acid molecule contains a nick, the nick can be converted to a phosphodiester bond, if desired, for example, by contacting the nicked recombinant nucleic acid molecule with a DNA ligase, by introducing the nicked recombinant nucleic acid molecule into a cell such as a bacterium that can repair the nick, or by any other method as desired. Thus, in one embodiment, a method of the invention includes a strand invasion event and a ligation.

Where a recombinant nucleic acid molecule generated according to a method of the invention does not comprise one strand that is derived completely from only one of the starting nucleic acid molecules, the method can further include a cleavage step, wherein the displaced nucleotide sequence is cleaved from the product. Such a cleaving step can be performed using any method known to cleave or degrade a single stranded nucleotide sequence, including, for example, contacting a recombinant nucleic acid molecule comprising the displaced strand with an enzyme having 5′ to 3′ or 3′ to 5′ single stranded nucleic acid exonuclease activity (depending on the orientation of the displaced strand). Such a method conveniently can be performed in vitro. Alternatively, the recombinant ds nucleic acid molecule can be introduced into a cell, for example an E. coli cell, wherein the displaced nucleotide sequence is cleaved.

A method of the invention can be used to generate a directionally linkedrecombinant nucleic acid molecule encoding a chimeric fusionpolypeptide. For generating such a recombinant nucleic acid molecule, afirst and second (or other) ds nucleic acid molecule each can encode allor a portion of an open reading frame, and the first and second (orother) ds nucleic acid molecules, which have first and/or second ends asdisclosed herein, are directionally linked. The chimeric polypeptide cancomprise a fusion polypeptide, in which the two (or more) encodedpeptides (or polypeptides) are translated into a single product, i.e.,the peptides are covalently linked through a peptide bond.

For example, a first ds nucleic acid molecule can encode a cellcompartmentalization domain, such as a plasma membrane localizationdomain, a nuclear localization signal, a mitochondrial membranelocalization signal, an endoplasmic reticulum localization signal, orthe like, or a protein transduction domain such as the humanimmunodeficiency virus TAT protein transduction domain, which canfacilitate translocation of a peptide linked thereto into a cell (seeSchwarze et al., Science 285:1569-1572, 1999; Derossi et al., J. Biol.Chem. 271:18188, 1996; Hancock et al., EMBO J. 10:4033-4039, 1991; Busset al., Mol. Cell. Biol. 8:3960-3963, 1988; U.S. Pat. No. 5,776,689 eachof which is incorporated herein by reference). Such a domain can beuseful to target a fusion polypeptide comprising the domain and apolypeptide encoded by a second ds nucleic acid molecule, to which it isdirectionally linked according to a method of the invention, to aparticular compartment in the cell, or for secretion from or entry intoa cell. As such, the invention provides a means to generatedirectionally linked recombinant nucleic acid molecules that encode achimeric polypeptide.

A fusion polypeptide expressed from a directionally linked recombinant nucleic acid molecule generated according to a method of the invention also can comprise a peptide having the characteristic of a detectable label or a tag such that the expressed fusion polypeptide can be detected, isolated, or the like. For example, a first, second or other ds nucleic acid molecule containing a topoisomerase recognition site, or cleavage product thereof, as disclosed herein, can encode an enzyme such as alkaline phosphatase, β-galactosidase, chloramphenicol acetyltransferase, luciferase, or other enzyme; or can encode a peptide tag such as a polyhistidine sequence (e.g., hexahistidine), a V5 epitope, a c-myc epitope, a hemagglutinin A epitope, a FLAG epitope, or the like. Expression of a fusion polypeptide comprising a detectable label can be detected using the appropriate reagent, for example, by detecting light emission upon addition of luciferin to a fusion polypeptide comprising luciferase, or by detecting binding of nickel ion to a fusion polypeptide comprising a polyhistidine tag.

A polyhistidine tag can comprise from about two to about ten contiguous histidine residues (e.g., two, three, four, five, six, seven, eight, nine, or ten contiguous histidine residues). The tag can also be a peptide tag which binds nickel ions, as well as other metal ions (e.g., copper ion), and can be used for metal chelate affinity chromatography. Examples of such tags include peptides having the formula: R₁-(His-X)_(n)-R₂, wherein (His-X)_(n) represents a metal chelating peptide and n is a number from two through ten (e.g., two, three, four, five, six, seven, eight, nine, or ten), and X is an amino acid selected from the group consisting of alanine, arginine, aspartic acid, asparagine, cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine and valine. Further, R₂ may be a polypeptide which is covalently linked to the metal chelating peptide and R₁ may be either a hydrogen or one or more (e.g., one, two, three, four, five, six, seven, eight, nine, ten, twenty, thirty, fifty, sixty, etc.) amino acid residues. In addition, R₁ may be a polypeptide which is covalently linked to the metal chelating peptide and R₂ may be either a hydrogen or one or more (e.g., one, two, three, four, five, six, seven, eight, nine, ten, twenty, thirty, fifty, sixty, etc.) amino acid residues. Tags of this nature are described in U.S. Pat. No. 5,594,115, the entire disclosure of which is incorporated herein by reference.
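The (His-X)_(n) formula can be illustrated with a short pattern search. The following is only a minimal sketch, assuming one-letter amino acid codes; the function name `find_his_x_tags` and the example sequences are hypothetical and are not part of the disclosure.

```python
import re

# One-letter codes for the twenty amino acids that X may be chosen from.
AMINO_ACIDS = "ARDNCEQGHILKMFPSTWYV"

# (His-X)n with n from two through ten: an H followed by any listed residue,
# repeated 2 to 10 times. Longer runs are reported as consecutive matches.
HIS_X_REPEAT = re.compile(r"(?:H[%s]){2,10}" % AMINO_ACIDS)


def find_his_x_tags(protein):
    """Return (start index, n, matched text) for each (His-X)n run found."""
    return [
        (m.start(), len(m.group()) // 2, m.group())
        for m in HIS_X_REPEAT.finditer(protein)
    ]


# Hypothetical sequence carrying a (His-Ala)3 run between flanking residues.
print(find_his_x_tags("MKHAHAHAGS"))
```

Because X may itself be histidine, a hexahistidine run also satisfies the pattern, with n equal to three.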

Similarly, isolation of a fusion polypeptide comprising a tag can be performed, for example, by passing a fusion polypeptide comprising a myc epitope over a column having an anti-c-myc epitope antibody bound thereto, then eluting the bound fusion polypeptide, or by passing a fusion polypeptide comprising a polyhistidine tag over a nickel ion or cobalt ion affinity column and eluting the bound fusion polypeptide. Methods for detecting or isolating such fusion polypeptides will be well known to those in the art, based on the selected detectable label or tag (see, for example, Hopp et al., BioTechnology 6:1204, 1988; U.S. Pat. No. 5,011,912; each of which is incorporated herein by reference).

In one embodiment, the directionally linked recombinant nucleic acid molecules encode chimeric polypeptides useful for performing a two hybrid assay. In such a method, the first ds nucleic acid molecule encodes a polypeptide, or a relevant domain thereof, that is suspected of having or being examined for the ability to interact specifically with one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) other polypeptides. The second ds nucleic acid molecule, to which the first ds nucleic acid molecule is to be directionally linked according to a method of the invention, can encode a transcription activation domain or a DNA binding domain. For example, a first ds nucleic acid molecule to be directionally linked is modified to contain a 5′ overhang on a first end and a topoisomerase recognition site, or cleavage product thereof, at or near the first end. A second ds nucleic acid molecule to be linked contains, or is modified to contain, a 5′ sequence complementary to the 5′ overhang at the first end of the first ds nucleic acid molecule. Upon contact of the first and second ds nucleic acid molecules with a topoisomerase, the directionally linked nucleic acid molecule encodes a first hybrid useful for performing a two hybrid assay (see, for example, Fields and Song, Nature 340:245-246, 1989; U.S. Pat. No. 5,283,173; Fearon et al., Proc. Natl. Acad. Sci., USA 89:7958-7962, 1992; Chien et al., Proc. Natl. Acad. Sci. USA 88:9578-9582, 1991; Young, Biol. Reprod. 58:302-311 (1998), each of which is incorporated herein by reference). Similar methods are used to generate the second hybrid protein, which can comprise a plurality of polypeptides to be tested for the ability to interact with the polypeptide, or domain thereof, of the first hybrid protein. Such methods similarly can be used to construct directionally linked nucleic acid molecules encoding fusion proteins useful for a modified form of a two hybrid assay such as the reverse two hybrid assay (Leanna and Hannink, Nucl. Acids Res. 24:3341-3347, 1996, which is incorporated herein by reference), the repressed transactivator system (U.S. Pat. No. 5,885,779, which is incorporated herein by reference), the protein recruitment system (U.S. Pat. No. 5,776,689, which is incorporated herein by reference), and the like.

As disclosed herein, a second ds nucleic acid molecule can be one of a plurality of nucleic acid molecules, for example, a cDNA library, a combinatorial library of nucleic acid molecules, or a population of variegated nucleic acid molecules. As such, the methods of the invention are particularly useful for generating recombinant polynucleotides encoding chimeric polypeptides for performing a high throughput two hybrid assay for identifying protein-protein interactions that occur among populations of polypeptides (see U.S. Pat. No. 6,057,101 and U.S. Pat. No. 6,083,693, each of which is incorporated herein by reference). In such a method, each of the hybrid proteins of the two hybrid assay is generated using a different one of two populations (pluralities) of nucleic acid molecules encoding polypeptides, each plurality having a complexity of from a few related but different nucleic acid molecules to as high as tens of thousands of such molecules. By performing a method of the invention, for example, using a PCR primer pair to amplify each nucleic acid molecule in a plurality, directionally linked recombinant polynucleotides encoding a population of chimeric bait polypeptides and a population of chimeric prey polypeptides readily can be generated. Such populations are generated by contacting the amplified pluralities of nucleic acid molecules, each of which comprises an appropriate end, with a topoisomerase and a nucleic acid molecule which contains a topoisomerase recognition site at or near its ends and encodes a transcription activation domain or a DNA binding domain.

A first ds nucleic acid molecule useful in a method of the invention also can encode a ribonucleic acid (RNA) molecule, which can function, for example, as a riboprobe, an antisense nucleic acid molecule, a ribozyme, or a triplexing nucleic acid molecule, or can be used in an in vitro translation reaction, and the second ds nucleic acid molecule can encode an expression control element useful for expressing an RNA from the first nucleic acid molecule. For example, where it is desired to produce a large amount of RNA, a second ds nucleic acid molecule component for performing a method of the invention can comprise an RNA polymerase promoter such as a T7, T3 or SP6 RNA polymerase promoter. Where the RNA molecule is to be expressed in a cell, for example, an antisense molecule to be expressed in a mammalian cell, the second (or other) ds nucleic acid molecule can include a promoter that is active in a mammalian cell, particularly a tissue specific promoter, which is active only in a target cell. Furthermore, where the RNA molecule is to be translated, for example, in a coupled in vitro transcription/translation reaction, the first nucleic acid molecule or second (or other) nucleic acid molecule can contain appropriate translational expression control elements.

A directionally or non-directionally linked recombinant nucleic acid molecule generated according to a method of the invention can be used for various purposes for which recombinant vectors containing a directionally or non-directionally inserted nucleic acid molecule are generally used. Thus, the directionally or non-directionally linked nucleic acid molecule can be used, for example, for expressing a polypeptide in a cell, for diagnosing or treating a pathologic condition, or the like. For administration to a living subject, the directionally or non-directionally linked recombinant nucleic acid molecule generally is formulated in a pharmaceutical composition suitable for administration to the subject. Thus, the invention provides pharmaceutical compositions containing a directionally or non-directionally linked recombinant nucleic acid molecule generated according to a method of the invention and expression products of this nucleic acid molecule. As such, the nucleic acid molecule is useful as a medicament for treating a subject suffering from a pathological condition.

Pharmaceutically acceptable carriers are well known in the art and include, for example, aqueous solutions such as water or physiologically buffered saline or other solvents or vehicles such as glycols, glycerol, oils such as olive oil or injectable organic esters. A pharmaceutically acceptable carrier can contain physiologically acceptable compounds that act, for example, to stabilize or to increase the absorption of the conjugate. Such physiologically acceptable compounds include, for example, carbohydrates, such as glucose, sucrose or dextrans, antioxidants, such as ascorbic acid or glutathione, chelating agents, low molecular weight proteins or other stabilizers or excipients. One skilled in the art would know that the choice of a pharmaceutically acceptable carrier, including a physiologically acceptable compound, depends, for example, on the route of administration of the composition, which can be, for example, orally or parenterally such as intravenously, and by injection, intubation, or other such method known in the art. The pharmaceutical composition also can contain a second reagent such as a diagnostic reagent, nutritional substance, toxin, or therapeutic agent, for example, a cancer chemotherapeutic agent.

The directionally linked recombinant nucleic acid molecule can be incorporated within an encapsulating material such as into an oil-in-water emulsion, a microemulsion, micelle, mixed micelle, liposome, microsphere or other polymer matrix (see, for example, Gregoriadis, Liposome Technology, Vol. 1 (CRC Press, Boca Raton, Fla. 1984); Fraley, et al., Trends Biochem. Sci., 6:77 (1981), each of which is incorporated herein by reference). Liposomes, for example, which consist of phospholipids or other lipids, are nontoxic, physiologically acceptable and metabolizable carriers that are relatively simple to make and administer. “Stealth” liposomes (see, for example, U.S. Pat. Nos. 5,882,679; 5,395,619; and 5,225,212, each of which is incorporated herein by reference) are an example of such encapsulating materials particularly useful for preparing a pharmaceutical composition, and other “masked” liposomes similarly can be used, such liposomes extending the time that a nucleic acid molecule remains in the circulation. Cationic liposomes, for example, also can be modified with specific receptors or ligands (Morishita et al., J. Clin. Invest., 91:2580-2585 (1993), which is incorporated herein by reference). The nucleic acid molecule also can be introduced into a cell by complexing it with an adenovirus-polylysine complex (see, for example, Michael et al., J. Biol. Chem. 268:6866-6869 (1993), which is incorporated herein by reference). Such compositions can be particularly useful for introducing a nucleic acid molecule into a cell in vivo or in vitro, including ex vivo, wherein the cell containing the nucleic acid molecule is administered back to the subject (see U.S. Pat. No. 5,399,346, which is incorporated herein by reference). A nucleic acid molecule generated according to a method of the invention also can be introduced into a cell using a biolistic method (see, for example, Sykes and Johnston, supra, 1999).

As disclosed herein, a directionally linked nucleic acid molecule generated according to a method of the invention contains a nick, which can be resolved, for example, by contacting the nicked recombinant nucleic acid molecule with a ligase. Such a directionally linked recombinant nucleic acid molecule that is covalently linked in both strands can be used as a template for an amplification reaction such as PCR. As such, a large amount of the construct can be generated. Furthermore, an amplification reaction can provide an in vitro selection method for obtaining only a desired product, without obtaining partial reaction products. For example, a method of the invention can be used to generate a directionally linked recombinant nucleic acid molecule comprising, operatively linked in a 5′ to 3′ orientation, a first ds nucleic acid molecule comprising a promoter, a second ds nucleic acid molecule comprising a coding region, and a third ds nucleic acid molecule comprising a polyadenylation signal, wherein the nicks in the generated recombinant nucleic acid molecule are ligated.

By selecting a PCR primer pair including a first primer complementary to a nucleotide sequence upstream of the promoter sequence, and a second primer complementary to a nucleotide sequence downstream of the polyadenylation signal, a functional amplification product comprising the promoter, coding region and polyadenylation signal can be generated. In contrast, partial reaction products that lack either the first ds nucleic acid molecule or the third ds nucleic acid molecule are not amplified because either the first or second primer, respectively, will not hybridize to the partial product. In addition, a construct lacking the second ds nucleic acid molecule would not be generated due to the lack of complementarity of the 5′ overhanging sequences of the first and third ds nucleic acid molecules. As such, the invention provides, in part, a means to obtain a desired functional, directionally linked recombinant nucleic acid molecule.
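The selection logic described above can be sketched as a simple check on which primer binding sites a ligated product carries. This is an illustrative model only; the segment names, the `SEGMENT_SITES` table, and the function `is_amplified` are hypothetical and assume that the forward primer site lies on the promoter-containing molecule and the reverse primer site on the polyadenylation-signal-containing molecule.

```python
# Primer binding sites carried by each ds nucleic acid molecule (assumed layout):
# the forward primer anneals upstream of the promoter, the reverse primer
# anneals downstream of the polyadenylation signal.
SEGMENT_SITES = {
    "promoter": {"forward"},
    "coding": set(),
    "polyA": {"reverse"},
}


def is_amplified(product):
    """Return True only if the product carries both primer binding sites."""
    sites = set()
    for segment in product:
        sites |= SEGMENT_SITES[segment]
    return {"forward", "reverse"} <= sites


# The full construct is amplified; partial products lack one primer site.
for product in [("promoter", "coding", "polyA"),
                ("promoter", "coding"),
                ("coding", "polyA")]:
    print(product, is_amplified(product))
```

The model ignores reaction kinetics and treats amplification as all-or-nothing, which is sufficient to show why partial products drop out.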

The use of an amplification reaction such as PCR in such a manner further provides a means to screen a large number of nucleic acid molecules generated according to a method of the invention in order to identify constructs of interest. Since methods for utilizing PCR in automated high throughput analyses are routine and well known, it will be recognized that the methods of the invention can be readily adapted to use in a high throughput system. Using such a system, a large number of constructs can be screened in parallel, and partial or incomplete reaction products can be identified and disposed of, thereby preventing a waste of time and expense that would otherwise be required to characterize the constructs or examine the functionality of the constructs in further experiments.

Recombinant nucleic acid molecules generated by a method of the invention wherein the first ds nucleic acid molecule contains a first topoisomerase but not a second topoisomerase or topoisomerase binding site, are generally linear, whereas, in other aspects, the methods of the invention can generate circular recombinant nucleic acid molecules. However, a directionally linked recombinant nucleic acid molecule that is generated as a linear molecule can be circularized, for example, where it is to be used as a vector. In addition, a linear directionally linked recombinant nucleic acid molecule generated by a method of the invention can be cloned into a vector, which can be a plasmid vector or a viral vector such as a bacteriophage, baculovirus, retrovirus, lentivirus, adenovirus, Vaccinia virus, Semliki Forest virus or adeno-associated virus vector, all of which are well known and can be purchased from commercial sources (Invitrogen Corp., La Jolla Calif.; Promega, Madison Wis.; Stratagene, La Jolla Calif.; GIBCO/BRL, Gaithersburg Md.).

The methods of the invention also can be used to detectably label a nucleic acid molecule with a chemical or small organic or inorganic moiety such that the nucleic acid molecule is useful as a probe. For example, a first ds nucleic acid molecule, which has a topoisomerase recognition site, or cleavage product thereof, at a 3′ terminus, can have bound thereto a detectable moiety such as a biotin, which can be detected using avidin or streptavidin, a fluorescent compound (e.g., Cy3, Cy5, Fam, fluorescein, or rhodamine), a radionuclide (e.g., sulfur-35, technetium-99, phosphorus-32, or tritium), a paramagnetic spin label (e.g., carbon-13), a chemiluminescent compound, an epitope, for example a peptide epitope, which can be detected using an antibody that recognizes the epitope, or the like, such that, upon generating a directionally linked double stranded recombinant nucleic acid molecule according to a method of the invention, the generated nucleic acid molecule will be labeled. Methods of detectably labeling a nucleic acid molecule with such moieties are well known in the art (see, for example, Hermanson, “Bioconjugate Techniques” (Academic Press 1996), which is incorporated herein by reference). It should be recognized that such elements as disclosed herein or otherwise known in the art, including nucleic acid molecules encoding cell compartmentalization domains, or detectable labels or tags, or comprising transcriptional or translational expression control elements can be useful components of a kit as disclosed herein.

A method of the invention, in which a first ds nucleic acid molecule with a first topoisomerase, but not a second topoisomerase or topoisomerase recognition site, is used can be particularly useful for generating an expressible recombinant nucleic acid molecule that can be inserted in a site specific manner into a target DNA sequence. The target DNA sequence can be any DNA sequence, particularly a genomic DNA sequence, and preferably a gene for which some or all of the nucleotide sequence is known. The method can be performed utilizing a first ds nucleic acid molecule, which has a first end and a second end and encodes a polypeptide, for example, a selectable marker, wherein the first ds nucleic acid molecule comprises a complementary sequence at both ends; and directionally linking the first ds nucleic acid molecule to first and second PCR amplification products, which are generated from sequences upstream and downstream of the site at which the construct is to be inserted, wherein the amplification products each contain a topoisomerase recognition site and a 5′ target sequence selected based on the factors set forth in the present disclosure. For example, the first and second amplification products have different 5′ target sequences such that, upon cleavage by a topoisomerase, each can be linked to a predetermined end of the first ds nucleic acid molecule.

The first and second amplification products are generated using two sets of PCR primer pairs. The two sets of PCR primer pairs are selected such that, in the presence of an appropriate polymerase such as Taq polymerase and a template comprising the sequences to be amplified, the primers amplify portions of a target DNA sequence that are upstream of and adjacent to, and downstream of and adjacent to, the site for insertion of the selectable marker. In addition, the sets of PCR primer pairs are designed such that the amplification products contain a topoisomerase recognition site and, following cleavage by the site specific topoisomerase, a 5′ overhanging sequence at the end to be directionally linked to the selectable marker. As such, the first PCR primer pair includes 1) a first primer, which comprises, in an orientation from 5′ to 3′, a nucleotide sequence complementary to a 5′ complementary sequence of the end of the selectable marker to which the amplification product is to be directionally linked, a nucleotide sequence complementary to a topoisomerase recognition site, and a nucleotide sequence complementary to a 3′ sequence of a target DNA sequence upstream of the insertion site; and 2) a second primer, which comprises a nucleotide sequence of the target genomic DNA upstream of the 3′ sequence to which the first primer is complementary, i.e., downstream of the insertion site. The second PCR primer pair includes 1) a first primer, which comprises, from 5′ to 3′, a nucleotide sequence complementary to the 5′ complementary sequence of the end of the selectable marker to which it is to be directionally linked, a nucleotide sequence complementary to a topoisomerase recognition site, and a nucleotide sequence of a 5′ sequence of a target DNA sequence, wherein the 5′ sequence of the target genomic DNA is downstream of the 3′ sequence of the target DNA sequence to which the first primer of the first PCR primer pair is complementary; and 2) a second primer, which comprises a nucleotide sequence complementary to a 3′ sequence of the target DNA sequence that is downstream of the 5′ sequence of the target genomic DNA contained in the first primer. The skilled artisan will recognize that the sequences of the primers that are complementary to the target genomic DNA are selected based on the sequence of the target DNA.

Upon contact of the first ds nucleic acid molecule comprising the selectable marker, the first and second amplification products (i.e., second and third ds nucleic acid molecules), and a topoisomerase (unless the molecules are topoisomerase-charged), a directionally linked recombinant nucleic acid molecule is generated. Following ligation of the nicks, the generated recombinant nucleic acid molecule can be further amplified, if desired, using PCR primers that are specific for an upstream and downstream sequence of the target genomic DNA, thus ensuring that only functional constructs are amplified. Such a generated directionally linked recombinant nucleic acid molecule is useful for performing homologous recombination in a genome, for example, to knock-out the function of a gene in a cell, or to confer a novel phenotype on the cell containing the generated recombinant nucleic acid molecule. The method can further be used to produce a transgenic non-human organism having the generated recombinant nucleic acid molecule stably maintained in its genome.

A method of the invention involving a first ds nucleic acid having a topoisomerase or topoisomerase-recognition site, for example, at a first end but not the second end, also can be useful for covalently linking an adapter or linker sequence to one or both ends of a second ds nucleic acid molecule of interest, including to each of a plurality of second (or other) ds nucleic acid molecules. For example, where it is desired to put linkers on both ends of a first ds nucleic acid molecule, the method can be performed by contacting a topoisomerase with a first ds nucleic acid molecule, which has a 5′ complementary sequence at or near each 5′ terminus that is complementary to an overhang sequence on a 5′ terminus of each of the second and third ds nucleic acid molecules; and a second ds nucleic acid molecule and a third ds nucleic acid molecule, each of which includes a topoisomerase recognition site at the appropriate 3′ terminus and a 5′ overhang sequence at or near the end containing the topoisomerase recognition site. An appropriate terminus is the terminus to which the linker is to be directionally linked to the first ds nucleic acid molecule. In performing such a method, the linker sequences comprising the second and at least third nucleic acid molecule can be the same or different.

A method of the invention involving a first ds nucleic acid molecule with a 5′ target sequence and a topoisomerase on a first end, can be performed to directionally or non-directionally link a first ds nucleic acid molecule to at least a second ds nucleic acid molecule. The method typically is used to directionally link the first ds nucleic acid molecule and the second ds nucleic acid molecule. However, the method can be used to non-directionally link the first ds nucleic acid molecule and the second ds nucleic acid molecule in the following embodiments: 1) Where a second nucleotide sequence is present at or near the 5′ terminus of the second end of the first ds nucleic acid molecule, that is capable of hybridizing to the 5′ complementary nucleotide sequence at the second end of the second ds nucleic acid molecule; and 2) Where a nucleotide sequence is present at or near the 5′ terminus of both the first end and the second end of the second ds nucleic acid molecule that is capable of hybridizing to the 5′ overhang at the first end of the first ds nucleic acid molecule. In these embodiments involving non-directional linking, the second end of the first ds nucleic acid molecule and the second end of the second ds nucleic acid molecule can be either blunt, or include an overhang.

In embodiments involving linking a third ds nucleic acid molecule to the second ds nucleic acid molecule, the methods can be used to directionally or non-directionally link the two nucleic acid molecules. The method typically is used to directionally link the second ds nucleic acid molecule and the third ds nucleic acid molecule. However, the method can be used to non-directionally link the third ds nucleic acid molecule and the second ds nucleic acid molecule in the following embodiments: 1) Where a second nucleotide sequence is present at or near the 5′ terminus of the second end of the third ds nucleic acid molecule, that is capable of hybridizing to the 5′ complementary nucleotide sequence at the second end of the second ds nucleic acid molecule; and 2) Where a nucleotide sequence is present at or near the 5′ terminus of both the first end and the second end of the second ds nucleic acid molecule that is capable of hybridizing to the 5′ overhang at the first end of the third ds nucleic acid molecule. In these embodiments involving non-directional linking, the second end of the third ds nucleic acid molecule and the second end of the second ds nucleic acid molecule can be either blunt, or include an overhang.

The present invention also provides a composition, which includes a first ds nucleic acid molecule having a first end and a second end, wherein the first end has a 5′ overhang and a topoisomerase covalently bound at the 3′ terminus; and a second ds nucleic acid molecule having a first blunt end and a second end, wherein the first blunt end has a first 5′ nucleotide sequence, which is complementary to the first 5′ overhang, and a first 3′ nucleotide sequence complementary to the first 5′ nucleotide sequence. In such a composition, the first 5′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule can be hybridized to the first 5′ overhang of the first end of the first nucleic acid molecule, wherein the first 3′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule is displaced. The first ds nucleic acid molecule in such a composition can further have a second 5′ overhang at the second end, and the second end of the second ds nucleic acid molecule can further include a second 5′ nucleotide sequence, which is complementary to the second 5′ overhang, and a second 3′ nucleotide sequence complementary to the second 5′ nucleotide sequence.
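The complementarity requirement in such a composition can be checked with a few lines of code. The following is a minimal sketch, assuming that both sequences are written 5′ to 3′; the function names and the four-nucleotide example sequences are hypothetical illustrations and are not sequences taken from this disclosure.

```python
# Watson-Crick complement table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def can_invade(overhang_5p, blunt_end_strand):
    """Check whether a 5' overhang can hybridize to the 5' sequence of a blunt end.

    overhang_5p      -- the single stranded 5' overhang, written 5' to 3'
    blunt_end_strand -- the strand of the second molecule whose 5' terminus
                        lies at the blunt end, written 5' to 3'
    """
    n = len(overhang_5p)
    return blunt_end_strand[:n] == reverse_complement(overhang_5p)


# Hypothetical example: the overhang pairs with the first four nucleotides
# of the blunt end, displacing the strand that was annealed there.
print(can_invade("GGTG", "CACCATGGCT"))
print(can_invade("GGTG", "ATGGCTCACC"))
```

The second call fails because the complementary sequence sits at the far end of the molecule, which is the basis for directional linking.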

The present invention also provides kits, which contain one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) reagents useful for directionally or non-directionally linking ds nucleic acid molecules. In one embodiment, a kit of the invention contains a ds nucleic acid molecule having a first end and a second end, wherein the first end contains a first 5′ overhang and a first topoisomerase covalently bound at the 3′ terminus, and the second end contains a second topoisomerase covalently bound at the 3′ terminus and contains a second 5′ overhang, a blunt end, a 3′ uridine overhang, or a 3′ thymidine overhang, wherein the first 5′ overhang is different from the second 5′ overhang. The topoisomerases, which can be the same or different, also can be a component of the kit. The ds nucleic acid molecule in the kit can, but need not, be a vector, and can contain one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) expression control elements, as well as instructions for using kit components.

A kit of the invention also can include a plurality of second ds nucleic acid molecules, wherein each ds nucleic acid molecule in the plurality has a first blunt end, and wherein the first blunt end includes a 5′ nucleotide sequence complementary to the first 5′ overhang of the first ds nucleic acid molecule. The second ds nucleic acid molecules in the plurality can be a plurality of transcriptional regulatory elements, translational regulatory elements, or a combination thereof, or can encode a plurality of peptides such as peptide tags, cell compartmentalization domains, and the like.

A ds nucleic acid molecule component of a kit can be, for example, a linearized vector such as a cloning vector or expression vector. If desired, such a kit can contain a plurality of ds nucleic acid molecules, each comprising a different expression control element or other element such as, but not limited to, a sequence encoding a tag or other detectable molecule or a cell compartmentalization domain. The different elements can be different types of a particular expression control element, for example, constitutive or inducible promoters or tissue specific promoters, or can be different types of elements including, for example, transcriptional and translational expression control elements, epitope tags, and the like. Such ds nucleic acid molecules may be topoisomerase-activated or can be activated with topoisomerase, and contain 5′ overhanging sequences, or sequences that become 5′ overhanging sequences after topoisomerase activation. In addition, the plurality of ds nucleic acid molecules can have 5′ overhanging sequences that are unique to a particular expression control element, or that are common to a plurality of related expression control elements, for example, to a plurality of different promoter elements. The 5′ overhanging sequences of ds nucleic acid molecules can be designed such that one or more expression control elements contained on the ds nucleic acid molecule can be operatively directionally linked to provide a useful function. For example, an element comprising a Kozak sequence and an element comprising a translation start site can have complementary 5′ overhangs such that the elements can be operatively directionally linked according to a method of the invention.

The invention further provides kits for linking nucleic acid molecules using methods described herein. Thus, kits of the invention may comprise one or more components for performing methods described herein. In particular embodiments, kits of the invention may comprise one or more components selected from the group consisting of instructions for use of kit components, one or more buffers, one or more nucleic acid molecules (e.g., one or more nucleic acid molecules having a 5′ overhang, a 3′ overhang, a 5′ overhang and a 3′ overhang, two 3′ overhangs, two 5′ overhangs, etc.), one or more topoisomerases, one or more ligases, one or more recombinases, one or more adapter linkers for preparing molecules having a 5′ overhang and/or a 3′ overhang, and/or one or more containers in which to perform methods of the invention.

The following examples are intended to illustrate but not limit the invention.

Example 1

In a preferred embodiment of the present invention, a topoisomerase-charged ds nucleic acid molecule is made by first obtaining a commercially available cloning vector. One such vector is pUni/V5-His version A (Invitrogen Corp., Carlsbad, Calif.), a circular supercoiled vector that contains uniquely designed elements. These elements include a BGH polyadenylation sequence to increase mRNA stability in eukaryotic hosts, a T7 transcription termination region, an R6Kγ DNA replication origin and a kanamycin resistance gene and promoter for antibiotic resistance selection. Additionally, pUni/V5-His version A contains a multiple cloning site, which is a synthetic DNA sequence encoding a series of restriction endonuclease recognition sites. These sites are engineered for cloning of DNA into a vector at a specific position. Also within the vector's multiple cloning site is a loxP site inserted 5′ to the endonuclease recognition sites, thereby facilitating Cre recombinase-mediated fusion into a variety of other expression vectors (Echo™ Cloning System, Invitrogen Corp., Carlsbad, Calif.). An optional C-terminal V5 epitope tag is present for easy detection of expressed fusion proteins using an Anti-V5 Antibody. An optional C-terminal polyhistidine (6×His) tag is also present to enable rapid purification and detection of expressed proteins. A bacterial ribosomal binding site downstream from the loxP site makes transcription initiation in E. coli possible. Though this combination of elements is specific for the pUni/V5-His version A cloning vector, many similar cloning and expression vectors are commercially available or can be assembled from sequences and by methods well known in the art. pUni/V5-His version A is a 2.2 kb double stranded plasmid (see FIGS. 3 and 5).

Construction of a topoisomerase I charged cloning vector from pUni/V5-His version A is accomplished by endonuclease digestion of the vector, followed by complementary annealing of synthetic oligonucleotides and site-specific cleavage of the heteroduplex by Vaccinia topoisomerase I. SacI and EcoRI are two of the many restriction endonuclease sites present within the multiple cloning site of pUni/V5-His version A (see FIG. 3). Digestion of pUni/V5-His version A with the corresponding restriction enzymes, SacI and EcoRI, will leave cohesive ends on the vector (5′-AGCT-3′ and 5′-AATT-3′, FIG. 6). These enzymes are readily available from numerous vendors including New England Biolabs (Beverly, Mass., Catalogue Nos. RO156S, SacI and RO101S, EcoRI). The digested pUni/V5-His version A is easily separated from the digested fragments using isopropanol precipitation. These and other methods for digesting and isolating DNA are well known to those skilled in the art (Sambrook et al., (1989) Molecular Cloning, A Laboratory Manual. Second edition. Cold Spring Harbor Laboratory Press; pages 5.28-5.32).

The purified, digested vector is then incubated with two specific oligonucleotide adapters and T4 DNA ligase. The adapters are oligonucleotide duplexes containing ends that are compatible with the SacI and EcoRI ends of the vector. One of skill in the art will readily appreciate that other adapter oligonucleotides with appropriate sequences can be made for other vectors having different restriction sites. Following incubation with T4 DNA ligase, the vector containing the ligated adapters is purified using isopropanol.

The adapter duplex that results from the annealing of TOPO D1 and TOPO D2 has a single-stranded EcoRI overhang at one end and a 12 nucleotide single stranded overhang at the other end.

The first adapter oligonucleotide (TOPO D1) has complementation to the EcoRI cohesive end, 3′-TTAA-5′. Furthermore, TOPO D1 has an additional 24 bp including the topoisomerase consensus pentapyrimidine element 5′-CCCTT located 16 bp upstream of the 3′ end. The remaining sequence and size of the TOPO D1 adapter oligo is variable, and can be modified to fit a researcher's particular needs. In the current embodiment, 5′-AATTGATCCCTTCACCGACATAGTACAG-3′ (SEQ ID NO:5) is the full sequence of the adapter used.

The second adapter oligonucleotide (TOPO D2) must have full complementation to TOPO D1. TOPO D2 complements directly 5′ of the EcoRI cohesive flap, extending the bottom strand of the linearized vector. Additionally, TOPO D2 contains the sequence 3′-GTGG, which is the target sequence, and the single-stranded overhang after topoisomerase cleavage, for directional cloning. In this embodiment, the single stranded overhang was chosen to complement the Kozak sequence known to help expression of ORFs in eukaryotic cells by increasing the efficiency of ribosome binding on the mRNA; however, sequence and length are highly variable to meet the specific needs of individual users. The complete sequence of TOPO D2 is 3′-CTAGGGAAGTGG-5′ (SEQ ID NO:6).
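The relationship between TOPO D1 and TOPO D2 stated above can be checked with a short script. The following is only an illustrative sketch: the helper name reverse_complement and the variable names are hypothetical, and the only inputs are the two sequences as printed (SEQ ID NO:5 and SEQ ID NO:6).

```python
# Illustrative sketch; helper and variable names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

TOPO_D1 = "AATTGATCCCTTCACCGACATAGTACAG"  # SEQ ID NO:5, written 5'->3'
TOPO_D2_3TO5 = "CTAGGGAAGTGG"            # SEQ ID NO:6, written 3'->5' as in the text
TOPO_D2 = TOPO_D2_3TO5[::-1]             # the same oligo rewritten 5'->3'

MOTIF = "CCCTT"
motif_end = TOPO_D1.index(MOTIF) + len(MOTIF)
nt_after_motif = len(TOPO_D1) - motif_end   # nucleotides 3' of CCCTT
additional_nt = len(TOPO_D1) - len("AATT")  # nucleotides beyond the EcoRI-compatible end

# Stretch of TOPO D1 that TOPO D2 base-pairs with, and where it starts
paired_region = reverse_complement(TOPO_D2)
pair_start = TOPO_D1.index(paired_region)

print(additional_nt, nt_after_motif, paired_region, pair_start)
```

With the sequences as printed, TOPO D2 pairs with the stretch of TOPO D1 that begins immediately after the AATT end, and the counts agree with the 24 and 16 nucleotide figures given for TOPO D1.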

Similar to above, the adapter duplex that results from the annealing of oligonucleotides TOPO D4 and TOPO D5 has a single-stranded SacI overhang at one end, and a 12 nucleotide single-stranded overhang at the other end.

The third adapter oligonucleotide (TOPO D5) has complementation to the SacI cohesive end, 3′-TCGA-5′. Similar to TOPO D1, TOPO D5 has additional bases creating a single stranded overhang. The length and sequence can vary based on the needs of the user. In the current embodiment, TOPO D5's sequence is 5′-AAGGGCGAGCT-3′ (SEQ ID NO:7).

The fourth adapter oligonucleotide (TOPO D4) has full complementation to TOPO D5, and complements directly 5′ of the SacI cohesive flap, extending the top strand of the linearized vector. TOPO D4 also contains the topoisomerase consensus sequence 5′-CCCTT. The remaining sequence and size of the TOPO D4 adapter oligo is variable and can be modified to fit a researcher's particular needs. In the current embodiment, the sequence of TOPO D4 is 3′-GACATGATACAGTTCCCGC-5′ (SEQ ID NO:8), which includes an additional 12 bp single stranded overhang.

These adapter oligonucleotides can be chemically synthesized using any of numerous techniques, including the phosphoramidite method (Caruthers et al., Meth. Enzymol. 154:287-313, 1987). This and other methods for the chemical synthesis of oligos are well known to those of ordinary skill in the art.

Complementary annealing of the purified digested vector and the adapter oligonucleotides is done by incubation of the DNA in the presence of T4 DNA ligase. Typical ligation reactions are performed by incubation of a cloning vector with suitable DNA fragments in the presence of ligase and an appropriate reaction buffer. Buffers for ligation reactions should contain ATP to provide energy for the reaction, as well as reducing reagents like dithiothreitol and pH stabilizers like Tris-HCl. The ratio of concentrations for the cloning vector and the DNA fragments is dependent on each individual reaction, and formulae for its determination are abundant in the literature (see, e.g., Protocols and Applications Guide (1991), Promega Corporation, Madison, Wis., p. 45). T4 ligase will catalyze the formation of a phosphodiester bond between adjacent 5′-phosphates and 3′-hydroxyl termini during the incubation. Cohesive end ligation can generally be accomplished in 30 minutes at 12-15° C., while blunt end ligation requires 4-16 hours at room temperature (Ausubel et al., (1992) Second Edition; Short Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, N.Y., pages 3.14-3.37); however, the parameter ranges vary for each experiment. In the current embodiment, purified, digested pUni/V5-His version A and the adapter oligos were incubated in the presence of T4 ligase and a suitable buffer for sixteen hours at 12.5° C. The resulting linearized and adapted vector comprises the purified cloning vector attached to the adapter oligonucleotides through base pair complementation and T4 ligase-catalyzed phosphodiester bonds (see FIG. 7).
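As an illustration of the kind of formula referred to above for relating vector and fragment amounts, the following sketch uses the commonly cited mass relationship for double stranded DNA: fragment mass equals vector mass times the ratio of fragment length to vector length times the desired fragment:vector molar ratio. The function name and the example numbers (100 ng of a 2.2 kb vector, a 1 kb fragment, a 3:1 molar ratio) are assumptions chosen for illustration and are not values from this example.

```python
def fragment_mass_ng(vector_ng: float, vector_bp: int,
                     fragment_bp: int, molar_ratio: float) -> float:
    """Mass of fragment (ng) giving the requested fragment:vector molar ratio.

    Assumes both molecules are double stranded, so mass scales with length.
    """
    return vector_ng * (fragment_bp / vector_bp) * molar_ratio

# Hypothetical reaction: 100 ng of a 2.2 kb vector, 1 kb fragment, 3:1 ratio
example_ng = fragment_mass_ng(vector_ng=100, vector_bp=2200,
                              fragment_bp=1000, molar_ratio=3)
print(round(example_ng, 1))
```

Because the adapter duplexes described here are short and partly single stranded, this length-proportional estimate is only approximate for them.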

Efficient modification of the adapted vector with topoisomerase requires the addition of an annealing oligo to generate double stranded DNA on TOPO D1's and TOPO D4's single stranded overhangs. Vaccinia topoisomerase I initially binds non-covalently to double stranded DNA. The enzyme then diffuses along the duplex until locating and covalently attaching to the consensus pentapyrimidine sequence 5′-CCCTT, forming the topoisomerase adapted complex (see Shuman et al., U.S. Pat. No. 5,766,891). Modification of the adapted vector takes place in the absence of DNA ligase to prevent the formation of phosphodiester bonds between the adapted vector and the annealing oligo, since phosphodiester bonds in the non-scissile strand will prevent the dissociation of the leaving group upon cleavage (FIGS. 8 and 9).

The annealing oligonucleotide (TOPO D3) must have complementation to the single stranded DNA overhangs of TOPO D1 and TOPO D4. In the current embodiment, the overhangs both share the following sequence, 5′-GACATAGTACAG-3′ (SEQ ID NO:9). Therefore, TOPO D3 has the following sequence, 3′-CTGTATCATGTCAAC-5′ (SEQ ID NO:10), which comprises full complementation to the adapter oligos' single stranded overhang and an additional 3 bp overhang, 3′-AAC-5′.
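The derivation of TOPO D3 from the shared overhang can be sketched as follows. The function names are hypothetical; the sequences are those printed above (SEQ ID NOs:5, 8 and 9), and the 3 nucleotide extension is taken from the text.

```python
# Illustrative sketch; function and variable names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement_3to5(seq_5to3: str) -> str:
    """Complement of a 5'->3' sequence, written 3'->5' (base for base)."""
    return seq_5to3.translate(COMPLEMENT)

def annealing_oligo_3to5(overhang_5to3: str, extension_3to5: str) -> str:
    """Annealing oligo, written 3'->5': complement of the overhang plus extension."""
    return complement_3to5(overhang_5to3) + extension_3to5

TOPO_D1 = "AATTGATCCCTTCACCGACATAGTACAG"  # SEQ ID NO:5, 5'->3'
TOPO_D4 = "GACATGATACAGTTCCCGC"[::-1]     # SEQ ID NO:8, rewritten 5'->3'
OVERHANG = "GACATAGTACAG"                 # SEQ ID NO:9, 5'->3'

# Both adapters end (3') in the same 12-nucleotide overhang
shared = TOPO_D1.endswith(OVERHANG) and TOPO_D4.endswith(OVERHANG)

topo_d3_3to5 = annealing_oligo_3to5(OVERHANG, "AAC")
print(shared, topo_d3_3to5)
```

The derived oligo matches the TOPO D3 sequence given above, and the check confirms that a single annealing oligo can serve both adapter overhangs.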

Incubation of the adapted vector with the annealing oligo in the presence of topoisomerase will create double stranded DNA to which topoisomerase can non-covalently bind (FIG. 10). Bound topoisomerase will search the double stranded DNA by a facilitated diffusion mechanism, until the 5′-CCCTT recognition motif is located. Cleavage of the phosphodiester backbone of the scissile strand 3′ of the motif is catalyzed via a nucleophilic attack on the 3′ phosphorus atom of the preferred oligonucleotide cleavage sequence 5′-CCCTT↓, resulting in covalent attachment of the DNA to the enzyme by a 3′-phosphotyrosyl linkage (see Shuman et al., (1989) Proc. Natl. Acad. Sci. U.S.A. 86, 9793-9796). Cleavage of the scissile strand creates a double stranded leaving group comprising the 3′ end of the adapter oligo, downstream from the 5′-CCCTT motif, and the annealing oligo TOPO D3. Although the leaving group can religate to the topoisomerase-modified end of the vector via 5′ hydroxyl-mediated attack of the phosphotyrosyl linkage, this reaction is disfavored when the leaving group is no longer covalently attached to the vector. The addition of T4 polynucleotide kinase and ATP to the cleavage/religation reaction further shifts the equilibrium toward the accumulation of trapped topoisomerase, since the kinase can phosphorylate the 5′ hydroxyl of the leaving group to prevent the rejoining from taking place (Ausubel et al., (1992) Second Edition; Short Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, N.Y., pp. 3.14-3.30). The resulting linearized vector comprises a blunt end from the TOPO D4/D3 leaving group and a single stranded overhang bearing end from the TOPO D1/D3 leaving group (FIG. 11). Both of the linearized cloning vector's ends are charged with topoisomerase, enabling fast, efficient and directional topoisomerase mediated insertion of an acceptor molecule.
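The outcome of cleavage at each adapter can be modeled at the sequence level. The sketch below is a simplification that treats cleavage as splitting the scissile strand immediately 3′ of the first CCCTT; it does not model the enzyme or the equilibrium. The function name topo_cleave and the variable names are hypothetical, and the alignment of each bottom-strand oligo directly after the 4 nucleotide cohesive end is an assumption consistent with the adapter descriptions above.

```python
# Illustrative sketch; names are hypothetical and the model is purely textual.
MOTIF = "CCCTT"

def topo_cleave(scissile_5to3: str):
    """Split the scissile strand immediately 3' of the first CCCTT.

    Returns (retained strand ending in CCCTT, leaving strand).
    """
    cut = scissile_5to3.index(MOTIF) + len(MOTIF)
    return scissile_5to3[:cut], scissile_5to3[cut:]

# EcoRI-side adapter: TOPO D1 (scissile) paired with TOPO D2
TOPO_D1 = "AATTGATCCCTTCACCGACATAGTACAG"  # SEQ ID NO:5, 5'->3'
TOPO_D2_3TO5 = "CTAGGGAAGTGG"            # SEQ ID NO:6, 3'->5'

retained_d1, leaving_d1 = topo_cleave(TOPO_D1)
# Assumed alignment: TOPO D2 pairs with TOPO D1 starting right after AATT
paired_with_retained = len(retained_d1) - len("AATT")
overhang_3to5 = TOPO_D2_3TO5[paired_with_retained:]  # bases left unpaired
overhang_5to3 = overhang_3to5[::-1]

# SacI-side adapter: TOPO D4 (scissile) paired with TOPO D5
TOPO_D4 = "GACATGATACAGTTCCCGC"[::-1]     # SEQ ID NO:8, rewritten 5'->3'
retained_d4, leaving_d4 = topo_cleave(TOPO_D4)

print(overhang_3to5, overhang_5to3, retained_d4, leaving_d4)
```

Under these assumptions the TOPO D1/D2 end is left with a 4 nucleotide 5′ overhang on the bottom strand, while the retained part of TOPO D4 is no longer than the portion of TOPO D5 it pairs with, which is consistent with the blunt end described for that side.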

Although the above example details the modification of pUni/V5-His version A to form the topoisomerase-modified directional cloning vector, a person of ordinary skill in the art will appreciate how to apply these methods to any plasmid, cosmid, virus, or other DNA. It should also be noted that this example demonstrates a vector containing a 5′ single stranded overhang comprising the sequence 5′-GGTG-3′; however, the design of adapter duplexes and annealing oligonucleotides would allow one of skill in the art to custom design overhangs of any sequence or length at one or both ends of a given vector.

Specifically, any plasmid, cosmid, virus or other DNA can be modified to possess a single stranded overhang of any convenient sequence and length. These are the basic steps: the vector is first subjected to a treatment that is known to linearize the DNA. Common procedures include, but are not limited to, restriction digestion and treatment with topoisomerase II. Following linearization, a custom single stranded overhang is added. In the above example, complementary oligonucleotides are added to the sticky ends of a restriction digestion, giving the desired single stranded overhang; however, single stranded overhang forming oligonucleotides can be added by T4 blunt end ligation as well. The single stranded overhang sequence is exposed by a topoisomerase I mediated single strand nicking. In turn, this single stranded overhang can be used to directionally insert a PCR product comprising one or more complementary nucleotide sequences.

Likewise, topoisomerase modification can be applied to any double-stranded plasmid, cosmid, virus or other piece of DNA. Methods for the attachment of topoisomerase I to double stranded DNA are well known in the art (see Shuman et al., U.S. Pat. No. 5,766,891). The strategic placement of topoisomerase onto a piece of double stranded DNA is determined by the incorporation of a topoisomerase I consensus sequence (see Shuman et al., U.S. Pat. No. 5,766,891). The topoisomerase I will bind the double stranded DNA, nick the scissile strand, thus revealing the predetermined single-stranded overhang sequence, and ligate the incoming PCR product in the correct, single stranded overhang mediated orientation.

Example 2

As an example of the application of the present invention to another plasmid, pCR® 2.1 (FIGS. 4 and 12) was modified to create a topoisomerase I adapted vector with a custom single stranded sequence.

The pCR® 2.1 plasmid is a 3.9 kb T/A cloning vector. Within the sequence of this vector are many uniquely designed elements. These elements include an f1 origin, a ColE1 origin, a kanamycin resistance gene, an ampicillin resistance gene, a LacZ-alpha fragment and a multiple cloning sequence located within the LacZ-alpha fragment, allowing for blue-white selection of recombinant plasmids. The multiple cloning sequence (FIG. 4) of the pCR® 2.1 plasmid contains numerous restriction sites, including but not limited to HindIII, SpeI and EcoRI; M13 forward and reverse primer sites; and a T7 RNA polymerase promoter.

Construction of the topoisomerase I charged vector possessing a custom single stranded sequence consists of endonuclease digestion followed by complementary annealing of synthetic oligonucleotides and the site specific cleavage of the heteroduplex by Vaccinia topoisomerase I. Digestion of the pCR® 2.1 plasmid with the restriction enzymes HindIII, SpeI and EcoRI leaves HindIII and EcoRI cohesive ends on the vector (FIG. 13). The dissociated fragment of pCR® 2.1 downstream from the HindIII cleavage site is further cleaved with SpeI in order to reduce its size. By reducing the size of the fragment, the digested vector is easily purified away from the smaller digested pieces by isopropanol precipitation. These enzymes are readily available from numerous vendors including New England Biolabs (Beverly, Mass., Catalogue Nos. RO104S, HindIII; RO133S, SpeI; RO101S, EcoRI). Methods for the digestion and the isolation of DNA are well known to those skilled in the art (Sambrook et al., supra, 1989).

The purified digested vector is incubated with four adapter oligonucleotides and T4 DNA ligase. These adapter oligonucleotides are designed to have complementation to either the HindIII cohesive end, the EcoRI cohesive end, or to each other. Following incubation with T4 DNA ligase, the adapted vector is purified using isopropanol.

The first adapter oligonucleotide (TOPO H) has complementation to the HindIII cohesive end, 3′-TCGA-5′. Furthermore, TOPO H has an additional 24 bp including the topoisomerase consensus pentapyrimidine element 5′-CCCTT located 19 bp upstream of the 3′ end. The remaining sequence and size of the TOPO H adapter oligo is variable, and can be modified to fit a researcher's particular needs. In the current embodiment, 5′-AGCTCGCCCTTATTCCGATAGTG-3′ (SEQ ID NO:11) is the full sequence of the adapter used.

The second adapter oligonucleotide (TOPO 16) must have full complementation to TOPO H. TOPO 16 complements directly 5′ of the HindIII cohesive end, extending the bottom strand of the linearized vector. Additionally, TOPO 16 contains the sequence 3′-TAAG, which is the chosen single stranded sequence for directional cloning. The complete sequence of TOPO 16 is 3′-GCGGGAATAAG-5′ (SEQ ID NO:12).

The third adapter oligonucleotide (TOPO 1) has complementation to the EcoRI cohesive end, 3′-TTAA-5′. Similar to TOPO H, TOPO 1 has additional bases containing the topoisomerase I consensus sequence CCCTT located 12 bp upstream of the 3′ end. The length and sequence of TOPO 1 can vary based on the needs of the user. In the current embodiment, TOPO 1's sequence is 5′-AATTCGCCCTTATTCCGATAGTG-3′ (SEQ ID NO:13).

The fourth adapter oligonucleotide (TOPO 2) has full complementation to TOPO 1, and complements directly 5′ of the EcoRI cohesive end, extending the top strand of the linearized vector. In the current embodiment, the sequence of TOPO 2 is 3′-GCGGGAA-5′.

Complementary annealing of the purified digested vector and the adapter oligonucleotides is done by incubation of the DNA in the presence of T4 DNA ligase. T4 ligase will catalyze the formation of a phosphodiester bond between adjacent 5′-phosphates and 3′-hydroxyl termini during the incubation. In the current embodiment, purified, digested pCR® 2.1 and the adapter oligos were incubated in the presence of T4 ligase and a suitable buffer for sixteen hours at 12.5° C. The resulting linearized and adapted vector comprises the purified cloning vector attached to the adapter oligonucleotides through base pair complementation and T4 ligase-catalyzed phosphodiester bonds (FIG. 13). Ligation techniques are abundant in the literature (see Ausubel et al., (1992) Second Edition; Short Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, N.Y., pp. 3.14-3.37).

Charging of the adapted vector with topoisomerase requires the addition of annealing oligonucleotides to generate double stranded DNA on TOPO H's and TOPO 1's single stranded overhangs. Charging of the adapted vector takes place in the absence of DNA ligase to prevent the formation of phosphodiester bonds between the adapted vector and the annealing oligo, since phosphodiester bonds in the non-scissile strand will prevent the dissociation of the leaving group upon cleavage (see FIG. 9).

The annealing oligonucleotide (TOPO 17) must have complementation to the single stranded DNA overhang of TOPO H. In the current embodiment, the overhang has the following sequence, 5′-CGATAGTG-3′. Therefore, TOPO 17 has the following sequence, 3′-GCTATCAC-5′, which comprises full complementation to the single stranded overhang of the adapter oligonucleotides.

The annealing oligonucleotide (TOPO 3) must have complementation to the single stranded DNA overhang of TOPO 1. In the current embodiment, the overhang has the following sequence, 3′-GTGATAGCCTTA-5′ (SEQ ID NO:14). Therefore, TOPO 3 has the following sequence, 5′-CAACACTATCGGAAT-3′ (SEQ ID NO:15), which comprises full complementation to the adapter oligonucleotide's single stranded overhang and an additional 3 bp overhang, 5′-CAA-3′.
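The two annealing oligonucleotides of this example can be derived from the adapter sequences in the same way. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each bottom-strand adapter (TOPO 16, TOPO 2) pairs with its top-strand partner starting directly after the 4 nucleotide cohesive end, so that whatever remains of the top strand is the single stranded tail.

```python
# Illustrative sketch; function and variable names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def single_stranded_tail(top_5to3: str, cohesive_len: int, bottom_len: int) -> str:
    """Part of the top adapter left unpaired by the cohesive end and bottom adapter."""
    return top_5to3[cohesive_len + bottom_len:]

TOPO_H = "AGCTCGCCCTTATTCCGATAGTG"  # SEQ ID NO:11, 5'->3'
TOPO_16_3TO5 = "GCGGGAATAAG"        # SEQ ID NO:12, 3'->5'
TOPO_1 = "AATTCGCCCTTATTCCGATAGTG"  # SEQ ID NO:13, 5'->3'
TOPO_2_3TO5 = "GCGGGAA"             # 3'->5'

tail_h = single_stranded_tail(TOPO_H, 4, len(TOPO_16_3TO5))
tail_1 = single_stranded_tail(TOPO_1, 4, len(TOPO_2_3TO5))

topo_17_3to5 = tail_h.translate(COMPLEMENT)       # written 3'->5', as in the text
topo_3_5to3 = "CAA" + reverse_complement(tail_1)  # written 5'->3', with 5'-CAA extension

print(tail_h, topo_17_3to5, tail_1, topo_3_5to3)
```

The derived tails and annealing oligos agree with the overhang, TOPO 17 and TOPO 3 sequences printed above.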

Incubation of the adapted vector with the annealing oligo in the presence of topoisomerase will create double stranded DNA to which topoisomerase can non-covalently bind (FIG. 14). Bound topoisomerase will search the double stranded DNA by a facilitated diffusion mechanism, until the 5′-CCCTT recognition motif is located. Cleavage of the phosphodiester backbone of the scissile strand 3′ of the motif will result in the covalent attachment of the DNA to the enzyme by a 3′-phosphotyrosyl linkage (Shuman et al., Proc. Natl. Acad. Sci. U.S.A. 86:9793-9796, 1989). Cleavage of the scissile strand creates a double stranded leaving group comprising the 3′ end of the adapter oligos, downstream from the 5′-CCCTT motif, and the complementary annealing oligonucleotide. The leaving group can religate to the topoisomerase adapted vector through its 5′ hydroxyl's attack of the phosphotyrosyl linkage, also catalyzed by topoisomerase. Addition of T4 polynucleotide kinase to the equilibrium reaction prevents the back reaction via the kinase-mediated phosphorylation of the leaving group's 5′ hydroxyl (Ausubel et al., (1992) Second Edition; Short Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, N.Y., pp. 3.14-3.30). The resulting linearized vector comprises a blunt end from the TOPO 1/3 leaving group and a single stranded sequence end from the TOPO H/17 leaving group (FIG. 15). Both of the linearized cloning vector's ends are charged with topoisomerase, enabling fast, efficient and directional topoisomerase mediated insertion of an acceptor molecule.

Directional Cloning According to the Invention.

This invention also provides a method for directional cloning of DNA. In the following example, the topoisomerase-charged ds nucleic acid vector according to the present invention constructed from pUni/V5-His version A was used for the directional insertion of ORFs from the GeneStorm™ Expression Ready Clones (Invitrogen Corp., Carlsbad, Calif.). The modified pUni vector was selected for the cloning of these ORFs because the added target sequence, which becomes a single strand overhang upon topoisomerase cleavage of the vector, has homology to the Kozak sequence known to enhance ORF expression. Note, however, that, as before, any plasmid, cosmid, virus or other DNA could be modified to possess the necessary single stranded sequence. Likewise, any DNA fragment could be modified to possess a homologous sequence to any single stranded overhang of a vector. As a point of interest, the sequence of the single stranded overhang can affect directional cloning efficiencies. For example, single stranded overhangs with low GC content will have lower annealing stability. In addition, single stranded overhangs that have high complementation to both ends of a DNA fragment to be cloned will lose the capability to direct these DNA inserts. Thus the sequence of a single stranded overhang should be carefully designed to avoid these and similar problems.
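The two design considerations just mentioned, GC content and complementation to both ends of the insert, can be screened with a simple check. The following is a rough sketch under stated assumptions: the function names and the example insert sequences are hypothetical, only exact matches over the full overhang length are considered, and no thermodynamic model of annealing stability is attempted.

```python
# Illustrative sketch; function names and example inserts are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)

def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def end_matches(insert_top_5to3: str, overhang_3to5: str):
    """For a blunt insert, report whether the 5'-terminal bases at each end
    can pair with a vector overhang written 3'->5'."""
    tag = overhang_3to5.translate(COMPLEMENT)  # pairing sequence, 5'->3'
    n = len(tag)
    first_end = insert_top_5to3[:n] == tag
    second_end = reverse_complement(insert_top_5to3)[:n] == tag
    return first_end, second_end

OVERHANG_3TO5 = "GTGG"  # overhang of the modified pUni vector, as in Example 1

directional = end_matches("CACCATGGCTAGCTAA", OVERHANG_3TO5)  # hypothetical insert
ambiguous = end_matches("CACCATGGCTAGGGTG", OVERHANG_3TO5)    # hypothetical insert

print(gc_fraction(OVERHANG_3TO5), directional, ambiguous)
```

An insert for which only one end matches can be oriented by the overhang; an insert for which both ends match illustrates the loss of directionality described above.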

Example 3

The present invention is particularly useful in the directional insertion of PCR products into vectors constructed according to the present invention. In the PCR amplification of the desired insert, the PCR primers are designed so as to complement identified sequences of the insert(s) that are to be directionally cloned into the topoisomerase-charged ds nucleic acid vector of the present invention. The primer designed to bind upstream of the DNA's coding strand is modified with an additional complementary nucleotide sequence on its 5′ end. The resulting PCR product will possess a complementary sequence allowing single stranded overhang mediated directional insertion into the topoisomerase-charged ds nucleic acid cloning vector of the present invention and subsequent expression of the product.

One embodiment comprises introducing to a donor duplex DNA substrate a single stranded overhang site by PCR amplifying the donor duplex DNA molecule with the 5′ oligonucleotide primer containing the single stranded overhang. PCR amplification of a region of DNA is achieved by designing oligonucleotide primers that complement a known area outside of the desired region. In a preferred embodiment, the primer that has homology to the coding strand of the double stranded region of DNA will possess an additional sequence of nucleotides complementary to the single strand overhang of the topoisomerase-charged ds nucleic acid cloning vector of the present invention.

Using the present invention in a high throughput format, we selected eighty-two known ORFs from the GeneStorm™ expression system (Invitrogen Corporation, Carlsbad, Calif.) for directional cloning into the topoisomerase-charged ds nucleic acid vector of the present invention; however, any sequence of DNA can be selected as desired by individual users. For each of these ORFs, primers are designed with homology to the coding and the non-coding strands. To clone PCR products in a directional fashion into the modified pUni/V5-His version A topoisomerase-charged ds nucleic acid vector of the present invention as described in Example 1, one primer of a given pair was modified to contain the nucleotide sequence complementary to the single strand overhang contained within the vector. In the current example, the coding primer contained the added sequence 5′-CACC-3′, which complements the single stranded overhang, 3′-GTGG-5′, of the topoisomerase-charged ds nucleic acid cloning vector of the present invention. PCR amplification of the above ORFs with their respective primers will produce double stranded DNA fragments, which possess the sequence complementary to the single strand overhang at their 5′ end (FIG. 16). We used Pfu polymerase in our PCR amplification, but it is well known that PCR reactions can be performed with either a proofreading polymerase such as Pfu, which yields blunt-ended products, or a non-proofreading thermophilic polymerase like Taq followed by a blunting step to remove the non-template nucleotide that such enzymes leave at the end of PCR products.
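The primer modification described above amounts to prepending the 4 nucleotide tag to the gene-specific portion of the coding-strand primer. A minimal sketch follows; the function name, the 20 nucleotide annealing length and the example ORF are hypothetical, and only the 5′-CACC-3′ tag and the 3′-GTGG-5′ overhang are taken from the text.

```python
# Illustrative sketch; the ORF, annealing length and function name are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def tagged_forward_primer(orf_5to3: str, tag: str = "CACC",
                          anneal_len: int = 20) -> str:
    """Coding-strand primer: directional tag followed by the start of the ORF."""
    return tag + orf_5to3[:anneal_len]

EXAMPLE_ORF = "ATGGCTAGCAAGGAGATCGTTACCGGTTAA"  # hypothetical ORF, 5'->3'

primer = tagged_forward_primer(EXAMPLE_ORF)
tag_partner_3to5 = "CACC".translate(COMPLEMENT)  # sequence the tag pairs with, 3'->5'

print(primer, tag_partner_3to5)
```

The tag's pairing partner, written 3′ to 5′, is the vector overhang, so a product amplified with such a primer carries the complementary sequence at the 5′ end of its coding strand.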

In the present example, 0.1 microgram of each primer was combined with 0.05 microgram of DNA containing an ORF in a PCR reaction mix totaling 50 microliters. Besides the primers and vector, the reaction mix also contained water, PCR buffer salts, 10 mM dNTPs and 1.25 units of Pfu polymerase. Thermal cycling temperatures were as follows: an initial 94° C. denaturation; followed by 25 repetitions of 94° C. denaturation, 55° C. primer annealing, and 72° C. elongation, each at one minute; and ending with a 72° C., fifteen minute elongation. However, these parameters will vary with each DNA fragment to be amplified. PCR amplification techniques are well known to those skilled in the art (Ausubel et al., (1992) Second Edition; Short Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, N.Y., pp. 15.3-15.4), as are techniques for the conversion of 3′ overhangs to blunt end termini (Protocols and Applications Guide, Promega Corp.; Madison, Wis., pp. 43-44, 1989).

Incubation of the PCR amplified donor duplex DNA containing the complementary nucleotide sequence with the modified pUni/V5-His version A topoisomerase-charged ds nucleic acid vector of the present invention results in the directional cloning of the donor DNA. For example, the eighty-two ORFs from the GeneStorm™ clone collection (Invitrogen Corporation, Carlsbad, Calif.) were amplified using adapted primers containing a complementary nucleotide sequence. Amplification of the 82 GeneStorm™ ORFs with the described modified primer pairs resulted in PCR products that had the complementary nucleotide sequence at their 5′ end. This ORF PCR product is combined with 10 ng of topoisomerase-charged ds nucleic acid cloning vector of the present invention in either sterile water or a salt solution. The reaction is mixed gently and incubated for 5 minutes at room temperature (22-23° C.). After five minutes, we placed the reaction on ice and then proceeded to the One Shot® Chemical Transformation or Electroporation (Invitrogen Corporation, Carlsbad, Calif., Catalogue # C4040-10 and C4040-50, respectively) (Invitrogen TOPO Cloning Protocol, Invitrogen Corp.). Topoisomerase had joined the adjacent strands of the vector and the product by catalyzing a rejoining reaction (FIG. 17). DNA fragments constructed with the complementary nucleotide sequence at their 5′ ends were thus correctly inserted into topoisomerase-charged ds nucleic acid cloning vectors of the present invention with a high efficiency.

Directional insertion of DNA fragments containing complementary 5′ sequences into ds nucleic acid cloning vectors according to certain embodiments of the present invention occurs with greater than 90% efficiency, as shown by sequencing multiple colonies of transformed host cells. In the current example, the topoisomerase-charged ds nucleic acid cloning vectors of the present invention containing the GeneStorm™ ORFs were incubated with transformation competent E. coli host cells. In seventy-four of the transformation reactions, the directional cloning of the ORFs into the topoisomerase-charged ds nucleic acid cloning vector of the present invention occurred in at least seven of the eight colonies picked, and fifty-nine of these cloning reactions were directional in all eight colonies picked. The overall directional cloning score was 609 of 656; thus, directional insertion was present in over 93% of the clones picked (see Table 1 below).

Example 4

In a similar example, using the above described modified pCR® 2.1 topoisomerase-charged ds nucleic acid vector of the present invention, a PCR-generated ORF encoding Green Fluorescent Protein (GFP) was directionally cloned in frame with the lacZ α fragment present in the vector (see FIG. 4). The primers used to amplify the GFP gene contained the requisite complementary nucleotide sequence 5′-ATTC-3′, and the known sequence for the translation initiating methionine, 5′-ATG-3′. Using the necessary cloning steps noted above, the PCR amplified GFP was inserted into the vector and transformed cells were grown on solid agar plates. Glowing colonies represented a correctly inserted PCR product (see Table 2 below).

These data represent a substantial improvement over the current state of the art in cloning, and furthermore present an invention in cloning that is highly compatible with high throughput techniques. Given directional cloning efficiencies greater than 90%, a user need only screen two colonies for each cloned DNA fragment. Thus, on a 96-well plate, forty-eight separate clones can be screened for directional insertion, 400% more than current cloning techniques. Use of this invention will streamline many high throughput gene expression operations, and allow them to run at a fraction of their current costs.
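The reasoning behind screening two colonies per clone can be made explicit with a short calculation. The sketch below assumes, for illustration only, that each picked colony is directional independently with probability 0.90 (the lower bound quoted above); the function names are hypothetical.

```python
# Illustrative sketch; assumes colonies are independent, which is an idealization.
def p_at_least_one_correct(p_directional: float, colonies: int) -> float:
    """Probability that at least one of the screened colonies is directional."""
    return 1.0 - (1.0 - p_directional) ** colonies

def clones_per_plate(wells: int, colonies_per_clone: int) -> int:
    """Number of separate clones that fit on one plate."""
    return wells // colonies_per_clone

p_two = p_at_least_one_correct(0.90, 2)
per_plate = clones_per_plate(96, 2)

print(f"{p_two:.2f}", per_plate)
```

Under this assumption, two colonies give about a 99% chance of recovering at least one directional clone, and a 96-well plate accommodates forty-eight clones at two colonies each.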

TABLE 1
Directional Cloning of ORFs using a topoisomerase-charged ds nucleic acid cloning vector of the present invention

Positive colonies/Clones tested    PCR reactions
8/8                                59
7/8                                15
6/8                                2
5/8                                1
4/8                                3
3/8                                2
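The totals implied by Table 1 can be tallied directly from its rows. The sketch below uses only the values printed in the table; the variable names are hypothetical. As printed, the rows sum to 612 positive clones out of 656, which is close to but not identical with the score of 609 quoted in the text, and the sketch makes no attempt to reconcile the two figures.

```python
# Rows of Table 1: positive colonies (out of 8 tested) -> number of PCR reactions
TABLE_1 = {8: 59, 7: 15, 6: 2, 5: 1, 4: 3, 3: 2}
CLONES_PER_REACTION = 8

reactions = sum(TABLE_1.values())
clones_tested = reactions * CLONES_PER_REACTION
positive_clones = sum(positive * count for positive, count in TABLE_1.items())
fraction_positive = positive_clones / clones_tested
at_least_seven = sum(count for positive, count in TABLE_1.items() if positive >= 7)

print(reactions, clones_tested, positive_clones,
      f"{fraction_positive:.1%}", at_least_seven)
```

The reaction count (82), the number of clones tested (656) and the number of reactions with at least seven of eight directional colonies (74) all agree with the figures given in the text.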

TABLE 2
In frame and directional insertion of GFP into modified pCR2.1 topoisomerase-charged ds nucleic acid cloning vector of the present invention

PCR product's 5′ sequence         Percentage of correct inserts    Total white colonies (contain a recombinant plasmid)
5′-ATTCATG-3′ homologous          86%                              457
5′-CAAGATG-3′ non-homologous      35%                              118
5′-ATTCGGATG-3′ frame shift       0%                               268
VECTOR ONLY                       0%                               31

Although the invention has been described with reference to the above examples, it will be understood that modifications and variations are encompassed within the spirit and scope of the invention. Accordingly, the invention is limited only by the following claims.

1. A method for generating a directionally linked recombinant nucleic acid molecule, the method comprising contacting: a) a topoisomerase-charged first double stranded (ds) nucleic acid molecule, comprising a first topoisomerase covalently bound at or near a first end, and a second topoisomerase covalently bound at or near a second end, said first end further comprising a first 5′ overhang, and said second end further comprising a blunt end, a 3′ thymidine overhang, or a second 5′ overhang; and b) a second ds nucleic acid molecule, comprising a first blunt end and a second end, wherein the first blunt end comprises at its 5′ terminus, a nucleotide sequence complementary to the first 5′ overhang, under conditions such that the nucleotide sequence complementary to the first 5′ overhang can selectively hybridize to the first 5′ overhang, whereby the first topoisomerase can covalently link the 3′ terminus of the first end of the first ds nucleic acid molecule with the 5′ terminus of the first end of the second ds nucleic acid molecule, and whereby the second topoisomerase can covalently link the 3′ terminus of the second end of the first nucleic acid molecule to the 5′ terminus of the second end of the second ds nucleic acid molecule, thereby generating a directionally linked nucleic acid molecule.

2. The method of claim 1, wherein the second end of the first ds nucleic acid molecule comprises a blunt end, and the second end of the second ds nucleic acid molecule comprises a blunt end.

3. The method of claim 1, wherein the second end of the topoisomerase-charged first ds nucleic acid molecule comprises a 3′ thymidine overhang, and the second end of the second ds nucleic acid molecule comprises a 3′ adenosine overhang.

4. The method of claim 1, wherein the topoisomerase-charged, first ds nucleic acid molecule comprises a second 5′ overhang at the second end, and the second ds nucleic acid comprising at the second end, a nucleotide sequence complementary to the second 5′ overhang.

5. The method of claim 1, wherein the first ds nucleic acid molecule is a vector.

6. The method of claim 5, wherein the topoisomerase-charged first ds nucleic acid molecule is a cloning vector.

7. The method of claim 6, wherein the topoisomerase-charged first ds nucleic acid molecule is an expression vector.

8. The method of claim 1, further comprising introducing the directionally-linked recombinant nucleic acid molecule into a cell.

9. The method of claim 8, wherein the cell is a eukaryotic cell.

10. The method of claim 9, wherein the cell is a mammalian cell.

11. A cell produced by the methods of claim 8.

12. A transgenic non-human organism generated from the cell of claim 11.

13. (canceled)

14. The method of claim 1, wherein the second ds nucleic acid molecule comprises an amplification product.

15. The method of claim 1, wherein the topoisomerase is a type IB topoisomerase.

16-55. (canceled)

56. A composition, comprising: a) a first ds nucleic acid molecule comprising a first end and a second end, wherein the first end comprises a 5′ overhang and a topoisomerase covalently bound at the 3′ terminus, and b) a second ds nucleic acid molecule comprising a first blunt end and a second end, wherein the first blunt end comprises a first 5′ nucleotide sequence, which is complementary to the first 5′-overhang, and a first 3′ nucleotide sequence complementary to the first 5′ nucleotide sequence.

57. The composition of claim 56, wherein the first 5′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule is hybridized to the first 5′ overhang of the first end of the first nucleic acid molecule, and the first 3′ nucleotide sequence of the first blunt end of the second ds nucleic acid molecule is displaced.

58. The composition of claim 56, wherein the first ds nucleic acid molecule further comprises a second 5′ overhang at the second end, wherein the second end of the second ds nucleic acid molecule further comprises a second 5′ nucleotide sequence, which is complementary to the second 5′ overhang, and a second 3′ nucleotide sequence complementary to the second 5′ nucleotide sequence.

59-61. (canceled)

62. A kit, comprising a) a first double stranded (ds) nucleic acid molecule, which comprises a first topoisomerase covalently bound at a 3′ terminus of a first end, and a second topoisomerase covalently bound at a 3′ terminus of a second end, said first end further comprising a first 5′ overhang and said second end further comprising a blunt end, a 3′ thymidine overhang, or a second 5′ overhang, wherein said first 5′ overhang is different from said second 5′ overhang; and b) a plurality of second ds nucleic acid molecules, wherein each ds nucleic acid molecule in the plurality comprises a first blunt end, and wherein the first blunt end comprises a 5′ nucleotide sequence complementary to the first 5′ overhang of the first ds nucleic acid molecule.

63. The kit of claim 62, wherein the second ds nucleic acid molecules in the plurality comprise transcriptional regulatory elements, translational regulatory elements, or a combination thereof.

64. The kit of claim 62, wherein the second ds nucleic acid molecules in the plurality comprise nucleotide sequences encoding a peptide.